EXTREMAL QUANTILE REGRESSION

By Victor Chernozhukov 
Quantile regression is an important tool for estimation of conditional
quantiles of a response Y given a vector of covariates X.
It can be used to measure the effect of covariates not only in the 
center of a distribution, but also in the upper and lower tails. This 
paper develops a theory of quantile regression in the tails. Specif- 
ically, it obtains the large sample properties of extremal (extreme 
order and intermediate order) quantile regression estimators for the 
linear quantile regression model with the tails restricted to the do- 
main of minimum attraction and closed under tail equivalence across 
regressor values. This modeling setup combines restrictions of ex- 
treme value theory with leading homoscedastic and heteroscedastic 
linear specifications of regression analysis. In large samples, extreme 
order regression quantiles converge weakly to argmin functionals of 
stochastic integrals of Poisson processes that depend on regressors, 
while intermediate regression quantiles and their functionals converge 
to normal vectors with variance matrices dependent on the tail pa- 
rameters and the regressor design. 

1. Introduction. Regression quantiles [Koenker and Bassett (1978)] es- 
timate conditional quantiles of a response variable Y given regressors X. 
They extend Laplace's (1818) median regression (least absolute deviation 
estimator) and generalize the ordinary sample quantiles to the regression 
setting. Regression quantiles are used widely in empirical work and stud- 
ied extensively in theoretical statistics. See, for example, Buchinsky (1994), 
Chamberlain (1994), Chaudhuri, Doksum and Samarov (1997), Gutenbrunner and Jurečková
(1992), Hendricks and Koenker (1992), Knight (1998), Koenker and Portnoy 
(1987), Portnoy and Koenker (1997), Portnoy (1991a) and Powell (1986), 
among others. 
Many potentially important applications of regression quantiles involve
the study of various extremal phenomena. In econometrics, motivating ex- 
amples include the analysis of factors that contribute to extremely low infant 
birthweights [cf. Abrevaya (2001)]; the analysis of the highest bids in auc- 
tions [cf. Donald and Paarsch (1993)]; and estimation of factors of high risk 
in finance [cf. Tsay (2002) and Chernozhukov and Umantsev (2001), among 
others]. In biostatistics and other areas, motivating examples include the 
analysis of survival at extreme durations [cf. Koenker and Geling (2001)]; 
the analysis of factors that impact the approximate boundaries of biolog- 
ical processes [cf. Cade (2003)]; image reconstruction and other problems 
where conditional quantiles near maximum or minimum are of interest [cf. 
Korostelev, Simar and Tsybakov (1995)]. 

An important peril to inference in the listed examples is that conven- 
tional large sample theory for quantile regression does not apply sufficiently 
far in the tails. In the nonregression case, this problem is familiar, well doc- 
umented and successfully dealt with by modern extreme value theory; see, 
for example, Leadbetter, Lindgren and Rootzén (1983), Resnick (1987) and
Embrechts, Klüppelberg and Mikosch (1997). The purpose of this paper is
to develop an asymptotic theory for quantile regression in the tails based 
on this theory. Specifically, this paper obtains the large sample properties 
of extremal (extreme order and intermediate order) quantile regression for 
the class of linear quantile regression models with conditional tails of the 
response variable restricted to the domain of minimum attraction and closed 
under the tail equivalence across conditioning values. 

The paper is organized as follows. After an introductory Section 2, Sec- 
tion 3 joins together the linear quantile regression model with the tail re- 
strictions of modern extreme value theory. These restrictions are imposed 
in a manner that allows regressors to impact the conditional tail quantiles 
of response Y differently than the central quantiles. The resulting mod- 
eling setup thus covers conventional location shift regression models, as 
well as more general quantile regression models. Section 4 provides the 
asymptotic theory for the sample regression quantiles under the extreme 
order condition, $\tau_T T \to k > 0$, where $\tau_T$ is the quantile index and $T$ is the
sample size. By analogy with the extreme order quantiles in nonregression
cases, the extreme order regression quantiles converge to extreme type variates
(functionals of multivariate Poisson processes that depend on regressors).
Our analysis of the case $\tau_T T \to k > 0$ builds on and complements
the analysis of $\tau_T T \to 0$ given by Feigin and Resnick (1994), Smith (1994),
Portnoy and Jurečková (1999) and Knight (2001) for various types of location
shift models. [Chernozhukov (1998) also studied some nonparametric
cases.] Section 5 derives the asymptotic distributions of regression quantiles
under the intermediate order condition, $\tau_T T \to \infty$, $\tau_T \to 0$, thus providing a
quantile regression analog of the results on the intermediate univariate quantiles
by Dekkers and de Haan (1989). As with the intermediate quantiles in
nonregression cases, the intermediate order regression quantiles, and their 
functionals such as Pickands type estimators of the extreme value index, 
analyzed in Section 6, are asymptotically normal with variance determined 
by both the tail parameters and the regressor design. Section 7 provides an 
illustration, Section 8 concludes, and Section 9 collects the proofs. 

2. The setting. Suppose $Y$ is the response variable in $\mathbb{R}$, and $X = (1, X_{-1}')'$
is a $d \times 1$ vector of regressors (typically transformations of original regressors).
(Throughout the paper, given a vector $x$, $x_{-1}$ denotes $x$ without its
first component $x_1$.) Denote the conditional distribution of $Y$ given $X = x$
by $F_Y(\cdot \mid x)$. The present focus is on $F_Y^{-1}(\tau \mid x) = \inf\{y : F_Y(y \mid x) > \tau\}$, where
$\tau$ is close to 0. Let there be a sample

$\{Y_t, X_t, t = 1, \ldots, T\}$ where $X_t \in \mathbf{X}$,

generated by a probability model with a conditional quantile function of the
classical linear-in-parameter form

(2.1) $F_Y^{-1}(\tau \mid x) = x'\beta(\tau)$ for all $\tau \in \mathcal{I}$, $x \in \mathbf{X}$,

where $\beta(\cdot)$ is a nonparametric function of $\tau$, which when $\mathcal{I} = (0, 1)$ also
corresponds to the stochastic model with random coefficients:

(2.2) $Y = X'\beta(\varepsilon)$, $\varepsilon \mid X \sim U(0, 1)$.

Here it is necessary that (2.1) holds for

(2.3) $\mathcal{I} = [0, \eta]$ for some $0 < \eta < 1$ and $x \in \mathbf{X}$, a compact subset of $\mathbb{R}^d$.

Different linear models (2.1) can be applied to different covariate regions X 
[which can be local neighborhoods of a given $x_0$, in which case the linear
model (2.1) is motivated as a Taylor expansion]. The model (2.1) plays a 
fundamental role in the theoretical and practical literature on quantile re- 
gression mentioned in the Introduction. Its appealing feature is the ability to 
capture quantile-specific covariate effects in a convenient linear framework. 

In the sequel, we combine the linear model (2.1) with the tail restric- 
tions from extreme value theory to develop applicable asymptotic results. 
It is of vital consequence to impose these restrictions in a manner that pre- 
serves the quantile-specific covariate effects, as motivated by the empirical 
examples listed in the Introduction. For instance, in the analysis of U.S. 
birthweights, Abrevaya (2001) finds that smoking and the absence of pre- 
natal care impact the low conditional quantiles of birthweights much more 
negatively than the central birthweight quantiles. The linear framework (2.1) 
is able to accommodate this type of impact through the quantile-specific coefficients
$\beta(\tau)$, where $\beta_{-1}(\tau)$, for $\tau$ near 0, describes the effect of covariate
factors on extremely low birthweights and, say, $\beta_{-1}(1/2)$ describes the effect
on central birthweights. Thus, when imposing extreme value restrictions, it 
is important to preserve this ability. 

The inference about $\beta(\tau)$ is based on the regression quantile statistics
$\hat\beta(\tau)$ [Koenker and Bassett (1978)] defined by the least asymmetric absolute
deviation problem:

(2.4) $\hat\beta(\tau) \in \arg\min_{\beta \in \mathbb{R}^d} \sum_{t=1}^{T} \rho_\tau(Y_t - X_t'\beta)$ where $\rho_\tau(u) = (\tau - 1(u < 0))u$,

of which Laplace's (1818) median regression is an important case with $\rho_{1/2}(u) = |u|/2$.
The statistics $\hat\beta(\tau)$ naturally generalize the ordinary sample quantiles
to the conditional setting. In fact, the usual univariate $\tau$-quantiles can be
recovered as the solution to this problem without covariates, that is, when
$X_t = 1$. [E.g., if $\tau T \in (0, 1)$, $\hat\beta(\tau) = Y_{(1)}$, and if $\tau T \in (1, 2)$, $\hat\beta(\tau) = Y_{(2)}$, etc.]
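
The recovery of ordinary order statistics from problem (2.4) without covariates can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names and the toy sample are hypothetical, and the minimization is done by brute force over the observations (which suffices because the objective is convex and piecewise linear with kinks at the data points).

```python
def check_loss(u, tau):
    """Asymmetric absolute loss rho_tau(u) = (tau - 1(u < 0)) * u."""
    indicator = 1.0 if u < 0 else 0.0
    return (tau - indicator) * u


def univariate_regression_quantile(y, tau):
    """Minimize sum_t rho_tau(y_t - b) over b, i.e. problem (2.4) with X_t = 1.

    The objective is convex and piecewise linear in b with kinks at the
    observations, so it is enough to search over the data points.
    """
    return min(y, key=lambda b: sum(check_loss(v - b, tau) for v in y))


# Hypothetical toy sample with T = 10 observations.
y = [3.1, -0.4, 2.2, 7.5, 0.9, 1.6, -2.3, 4.8, 5.0, 0.1]
order_stats = sorted(y)

q_first = univariate_regression_quantile(y, 0.05)   # tau * T = 0.5, in (0, 1)
q_second = univariate_regression_quantile(y, 0.15)  # tau * T = 1.5, in (1, 2)
print(q_first, order_stats[0])
print(q_second, order_stats[1])
```

With $\tau T$ in $(0, 1)$ the minimizer is the smallest observation, and with $\tau T$ in $(1, 2)$ it is the second smallest, in line with the bracketed example above.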

In order to provide large sample properties of $\hat\beta(\tau)$ in the tails, we distinguish
three types of sample regression quantiles, following the classical theory
of order statistics: (i) an extreme order sequence, when $\tau_T \searrow 0$, $\tau_T T \to k > 0$,
(ii) an intermediate order sequence, when $\tau_T \searrow 0$, $\tau_T T \to \infty$, (iii) a
central order sequence, when $\tau \in (0, 1)$ is fixed, and $T \to \infty$ (under which
the conventional theory applies). We consider $\hat\beta(\tau_T)$ under the extreme and
intermediate order sequences, and refer to $\hat\beta(\tau_T)$ under both sequences as the
extremal regression quantiles. In what follows, we omit the $T$ in $\tau_T$ whenever
it does not cause confusion.

3. The extreme value restrictions on the linear quantile regression model. 

This section joins the linear model (2.1) together with the tail restrictions 
from extreme value theory, examines the consequences and presents exam- 
ples. 

Consider a random variable $u$ with distribution function $F_u$ and lower
end-point $s_u = 0$ or $s_u = -\infty$. Recall [cf. Resnick (1987)] that $F_u$ is said to
have tail of type 1, 2 or 3 if for

(3.1)
type 1: as $z \searrow s_u = 0$ or $-\infty$, $F_u(z + v a(z)) \sim F_u(z) e^{v}$ $\forall v \in \mathbb{R}$, $\xi \equiv 0$,
type 2: as $z \searrow s_u = -\infty$, $F_u(vz) \sim v^{-1/\xi} F_u(z)$ $\forall v > 0$, $\xi > 0$,
type 3: as $z \searrow s_u = 0$, $F_u(vz) \sim v^{-1/\xi} F_u(z)$ $\forall v > 0$, $\xi < 0$,

where $a(z) \equiv \int_{s_u}^{z} F_u(v)\,dv / F_u(z)$, for $z > s_u$. The number $\xi$ is commonly
called the extreme value index, and $F_u$ with tails of types 1-3 is said to
belong to the domain of minimum attraction. [$a(z) \sim b(z)$ denotes that
$a(z)/b(z) \to 1$ as a specified limit over $z$ is taken.]
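
As a numerical illustration of the type 2 and type 3 definitions, the ratio $F_u(vz)/F_u(z)$ can be evaluated far in the lower tail and compared with $v^{-1/\xi}$. This is only a sketch under stated assumptions: the two distributions (a standard Cauchy lower tail with $\xi = 1$, and $F(z) = z^{\alpha}$ on $[0, 1]$ with $\xi = -1/\alpha$) and all function names are chosen here for illustration and do not come from the paper.

```python
import math


def cauchy_cdf(z):
    """Standard Cauchy c.d.f.; the z < 0 branch avoids cancellation in the tail."""
    if z < 0:
        return math.atan(-1.0 / z) / math.pi
    return 0.5 + math.atan(z) / math.pi


def power_cdf(z, alpha=2.0):
    """F(z) = z**alpha on [0, 1], with finite lower end-point 0."""
    return min(max(z, 0.0), 1.0) ** alpha


def tail_ratio(cdf, v, z):
    """F(v z) / F(z), to be compared with v**(-1/xi)."""
    return cdf(v * z) / cdf(z)


v = 3.0
# Type 2 (z -> -infinity): Cauchy has xi = 1, so the limit is v**(-1) = 1/3.
ratio_type2 = tail_ratio(cauchy_cdf, v, -1e6)
# Type 3 (z -> 0): alpha = 2 gives xi = -1/2, so the limit is v**2 = 9.
ratio_type3 = tail_ratio(power_cdf, v, 1e-4)
print(ratio_type2, v ** (-1.0 / 1.0))
print(ratio_type3, v ** (-1.0 / -0.5))
```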

Condition R1. In addition to (2.1), there exists an auxiliary line
$x \mapsto x'\beta_r$ such that for

(3.2) $U \equiv Y - X'\beta_r$ with $s_U = 0$ or $s_U = -\infty$,

and some $F_u$ with type 1, 2 or 3 tails,

(3.3) $F_U(z \mid x) \sim K(x) \cdot F_u(z)$ uniformly in $x \in \mathbf{X}$, as $z \searrow s_U$,

where $K(\cdot) > 0$ is a continuous bounded function on $\mathbf{X}$. Without loss of
generality, let $K(x) = 1$ at $x = \mu_X$ and $F_u(z) \equiv F_U(z \mid \mu_X)$.

Condition R2. The distribution function of $X = (1, X_{-1}')'$, $F_X$, has
compact support $\mathbf{X}$ with $EXX'$ positive definite. Without loss of generality,
let $\mu_X = EX = (1, 0, \ldots, 0)'$.

When $Y$ has a finite lower endpoint, that is, $X'\beta(0) > -\infty$, it is implicit
in Condition R1 that $\beta_r \equiv \beta(0)$ so that $U \equiv Y - X'\beta(0) \ge 0$ has endpoint 0
by construction. In the unbounded support case, $X'\beta(0) = -\infty$ and is
not suitable as an auxiliary line, but existence of any other line such that
Condition R1 holds suffices.

Condition R1 is the main assumption. First, Condition R1 requires the
tails of $U = Y - X'\beta_r$ for some $\beta_r$ to be in the domain of minimum attraction,
which is a nonparametric class of distributions [cf. Resnick (1987) and
Embrechts, Klüppelberg and Mikosch (1997)]. In this sense, the specification
Condition R1 is semiparametric. Examples 3.1 and 3.2 present some of
the regression models covered by Condition R1. Second, Condition R1 also
requires that, for any $x', x'' \in \mathbf{X}$, $z \mapsto F_U(z \mid x')$ and $z \mapsto F_U(z \mid x'')$ are tail
equivalent up to a constant. This condition is motivated by the closure of
the domain of minimum attraction under tail equivalence [cf. Proposition
1.19 in Resnick (1987)].

Compactness of $\mathbf{X}$ in Condition R1 is necessary, as the limit theory for
regression quantiles may generally change otherwise. In applications, compactness
may be imposed by the explicit trimming of observations depending
on whether $X_t \in \mathbf{X}$. In this case the linear model (2.1) is assumed to apply
only to values of $X$ in $\mathbf{X}$. Clearly, the smaller $\mathbf{X}$, the less restrictive is the
linear model by virtue of Taylor approximation [e.g., Chaudhuri (1991)].
Also, trimming $X$ to $\mathbf{X}$ eliminates the impact of outlying values on the
limit distribution and inference, as it does in the case of the central regression
quantiles. In some cases it should be possible to make $\mathbf{X}$ unbounded
by imposing higher level nonprimitive conditions, for example, similar to
those on page 98 in Knight (2001). However, since we view $\mathbf{X}$ as a "small"
neighborhood over which the linear approximation (2.1) is adequate, we do
not pursue this extension.

Theorem 3.1 shows that the function $K(x)$ in Condition R1 can be represented
by the following types. Other properties of the linear quantile regression
model under Conditions R1 and R2 are obtained in Lemma 9.1 given
in Section 9.1.

Theorem 3.1 [Three types of $K(\cdot)$]. Under Conditions R1 and R2, for
some $c \in \mathbb{R}^d$,

(3.4) $K(x) = \begin{cases} e^{-x'c}, & \text{when } F_u \text{ has type 1 tails, } \xi = 0, \\ (x'c)^{1/\xi}, & \text{when } F_u \text{ has type 2 tails, } \xi > 0, \\ (x'c)^{1/\xi}, & \text{when } F_u \text{ has type 3 tails, } \xi < 0, \end{cases}$

where $\mu_X'c = 1$ for type 2 and 3 tails, $\mu_X'c = 0$ for type 1 tails, and $x'c > 0$
for all $x \in \mathbf{X}$ for types 2 and 3.

Remark 3.1. The condition $X'c > 0$ a.s. for tails of types 2 and 3 arises
from the linearity assumption (2.1). Indeed, (2.1) imposes that the quantiles
should not cross: if $l > 1$, then $X'(\beta(l\tau) - \beta(\tau)) > 0$ a.s. Since by Lemma
9.1(v) $X'(\beta(l\tau) - \beta(\tau)) / \mu_X'(\beta(l\tau) - \beta(\tau)) \to X'c$ as $\tau \searrow 0$, the noncrossing
condition requires $X'c > 0$ a.s. In location-scale shift models (cf. Example
3.2), the condition $X'c > 0$ a.s. is equivalent to a logical restriction on the
scale function ($X'\sigma > 0$ a.s.). In location shift models (cf. Example 3.1), this
condition is ordinarily satisfied since $X'c = 1$ a.s. for tails of types 2 and 3.

Remark 3.2. The general case when $P\{K(X) \ne 1\} > 0$ will be referred
to as the heterogeneous case, and $c$ will be referred to as the heterogeneity
index. The special case with

(3.5) $K(X) = 1$ a.s.

will be referred to as the homogeneous case. The latter amounts to $c = 0$
for type 1 tails, and $c = (1, 0, \ldots)'$ for type 2 and 3 tails. Notice that in
this case $X'c = 1$ a.s. for types 2 and 3 and $X'c = 0$ a.s. for type 1 tails.

In developing regularity conditions which target regression applications, 
it is natural to try to cover the most conventional regression settings and, 
hopefully, more general stochastic specifications. The following examples 
clarify this possibility. 

Example 3.1 (Location shift regression). Consider the location-shift
model

(3.6) $Y = X'\beta + U$,

where $U$ is independent of $X$, and suppose $U$ is in the domain of minimum
attraction. When the lower endpoint of the support of $U$ is finite, it
is normalized to 0. Clearly, this is a special case of Condition R1 where
$X'\beta_r = X'\beta$, $U = Y - X'\beta$, $K(X) = 1$ a.s. The data generating process (3.6)
has been widely adopted in regression work at least since Huber (1973) and
Rao (1965). A variety of standard survival and duration models also imply
(3.6) after a transformation, for example, the Cox models with Weibull
hazards and accelerated failure time models [cf. Doksum and Gasko (1990)].
Also, (3.6) underlies many theoretical studies of quantile regression. Hence,
it is useful that Condition R1 covers (3.6).

Example 3.2 (Location-scale shift regression). As a generalization of (3.6),
consider the stochastic equation

(3.7) $Y = X'\beta + X'\sigma \cdot V$, $V$ is independent of $X$,

where $X'\sigma > 0$ (a.s.) is the scale function, and $V$ is in the domain of minimum
attraction with $\xi \ne 0$. (3.7) implies the following linear conditional quantile
function

(3.8) $F_Y^{-1}(\tau \mid X) = X'\beta + X'\sigma \cdot F_V^{-1}(\tau)$.

Then for $X'\beta_r \equiv X'\beta$, $U \equiv Y - X'\beta_r = X'\sigma \cdot V$, we have $P(X'\sigma \cdot V \le z \mid X) \sim (X'\sigma)^{1/\xi} \cdot F_V(z)$
as $z \searrow 0$ or $-\infty$, so Condition R1 is satisfied with $F_u \equiv F_V$
and $K(X) \equiv (X'\sigma)^{1/\xi}$. The data generating process (3.7) has been adopted
in, for example, Koenker and Bassett (1982), Gutenbrunner and Jurečková
(1992) and He (1997).
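
To make the location-scale example concrete, the sketch below computes the quantile coefficients implied by (3.8), $\beta(\tau) = \beta + \sigma F_V^{-1}(\tau)$, and checks the tail constant $K(x) = (x'\sigma)^{1/\xi}$ numerically. The error law ($F_V(z) = z^{\alpha}$ on $[0, 1]$, a type 3 tail with $\xi = -1/\alpha$), the parameter values, and the function names are all assumptions made for illustration only.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


ALPHA = 2.0          # assumed error law: F_V(z) = z**ALPHA on [0, 1]
XI = -1.0 / ALPHA    # implied extreme value index (type 3 tail)


def quantile_coefficients(beta, sigma, tau):
    """beta(tau) = beta + sigma * F_V^{-1}(tau), read off from (3.8)."""
    q_v = tau ** (1.0 / ALPHA)
    return [b + s * q_v for b, s in zip(beta, sigma)]


def cdf_v(z):
    return min(max(z, 0.0), 1.0) ** ALPHA


def cdf_u_given_x(z, x, sigma):
    """P(X'sigma * V <= z | X = x) = F_V(z / x'sigma), using V independent of X."""
    return cdf_v(z / dot(x, sigma))


# Hypothetical parameter values; x'sigma = 2 > 0 as the model requires.
beta, sigma, x, tau = [1.0, 2.0], [1.0, 0.5], [1.0, 2.0], 0.04

coef = quantile_coefficients(beta, sigma, tau)
cond_quantile = dot(x, coef)                     # x'beta(tau)
u_quantile = cond_quantile - dot(x, beta)        # quantile of U given x
k_ratio = cdf_u_given_x(0.01, x, sigma) / cdf_v(0.01)
k_formula = dot(x, sigma) ** (1.0 / XI)
print(cond_quantile, cdf_u_given_x(u_quantile, x, sigma), k_ratio, k_formula)
```

For this power-law error the tail equivalence in Condition R1 holds exactly rather than only asymptotically, which is why the ratio matches $(x'\sigma)^{1/\xi}$ at any small $z$.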

Example 3.3 (Quantile-shift regression). To see that Condition R1 covers
more general stochastic models than (3.6) and (3.7), note that Condition R1
requires that $F_U(u \mid X)$ or $F_Y(u \mid X)$ be independent of $X$ only in the
tails. In both cases, these weaker independence requirements allow $X$, for example,
to have a negative impact on the high and low quantiles but to have a
positive impact on the median quantiles. In contrast, notice from (3.8) that
(3.6) and (3.7) preclude such quantile-specific impacts. Thus, Condition R1
preserves the heterogeneous impact property of (2.1), allowing the impact of
covariate factors on extreme quantiles to be very different from their impact
on the central quantiles.

4. Asymptotics of extreme order regression quantiles. Consider sequences
$\tau_i$, $i = 1, \ldots, l$, such that $\tau_i T \to k_i > 0$ as $T \to \infty$, and the corresponding normalized
regression quantile statistics $\hat Z_T(k_i)$, where

(4.1) $\hat Z_T(k) \equiv a_T(\hat\beta(\tau) - \beta_r - b_T e_1)$,

$\hat\beta(\tau)$ is the regression quantile, $\beta_r$ is the coefficient of the auxiliary line
defined in (3.2), $e_1 \equiv (1, 0, \ldots)' \in \mathbb{R}^d$, and $(a_T, b_T)$ are the canonical normalization
constants, given by

(4.2)
for type 1 tails: $a_T \equiv 1/a[F_u^{-1}(1/T)]$, $b_T \equiv F_u^{-1}(1/T)$,
for type 2 tails: $a_T \equiv -1/F_u^{-1}(1/T)$, $b_T \equiv 0$,
for type 3 tails: $a_T \equiv 1/F_u^{-1}(1/T)$, $b_T \equiv 0$,

where $F_u$ is defined in Condition R1. Moreover, consider the centered statistic

(4.3) $\hat Z_T^c(k) \equiv a_T(\hat\beta(\tau) - \beta(\tau))$

and the point process, for $U_t \equiv Y_t - X_t'\beta_r$,

(4.4) $\hat N(\cdot) \equiv \sum_{t=1}^{T} 1(\{a_T(U_t - b_T), X_t\} \in \cdot)$.

We will show that $\hat N(\cdot)$ converges weakly to the Poisson process

(4.5) $N(\cdot) = \sum_{i=1}^{\infty} 1(\{J_i, X_i\} \in \cdot)$

with points $\{J_i, X_i\}$ satisfying

(4.6) $(J_i, X_i) = \begin{cases} (\ln(\Gamma_i) + X_i'c, X_i), & \text{for type 1 tails,} \\ (-\Gamma_i^{-\xi} X_i'c, X_i), & \text{for type 2 tails,} \\ (\Gamma_i^{-\xi} X_i'c, X_i), & \text{for type 3 tails,} \end{cases}$ $i \ge 1$,

where $\{X_i\}$ is an i.i.d. sequence with law $F_X$,

(4.7) $\Gamma_i \equiv \mathcal{E}_1 + \cdots + \mathcal{E}_i$, $i \ge 1$,

and $\{\mathcal{E}_j\}$ is an i.i.d. sequence of unit-exponential variables, independent of
$\{X_i\}$. In the homogeneous case (3.5), $J_i$ and $X_i$ are independent since

(4.8) $X_i'c = \begin{cases} 0, & \text{for type 1 tails,} \\ 1, & \text{for type 2 and 3 tails,} \end{cases}$ for all $i \ge 1$.
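
The points of the limit Poisson process are straightforward to simulate from partial sums of unit-exponential variables. The sketch below draws the first $n$ points $(J_i, X_i)$ for each tail type; it is an illustration only, and the function name `limit_points`, the regressor law in `draw_x`, and the parameter values are hypothetical choices rather than anything specified in the paper.

```python
import math
import random


def limit_points(n, tail_type, xi, c, draw_x, rng):
    """Draw the first n points (J_i, X_i) of the limit process.

    Gamma_i is the running sum of unit-exponential variables, drawn
    independently of the regressors X_i.
    """
    points, gamma = [], 0.0
    for _ in range(n):
        gamma += rng.expovariate(1.0)
        x = draw_x(rng)
        xc = sum(a * b for a, b in zip(x, c))
        if tail_type == 1:
            j = math.log(gamma) + xc
        elif tail_type == 2:
            j = -(gamma ** (-xi)) * xc
        elif tail_type == 3:
            j = (gamma ** (-xi)) * xc
        else:
            raise ValueError("tail_type must be 1, 2 or 3")
        points.append((j, x))
    return points


def draw_x(rng):
    # Assumed design: intercept plus one centered bounded regressor.
    return [1.0, rng.random() - 0.5]


rng = random.Random(1)
# Homogeneous cases: c = 0 for type 1, c = (1, 0)' for types 2 and 3.
pts1 = limit_points(20, 1, 0.0, [0.0, 0.0], draw_x, rng)
pts2 = limit_points(20, 2, 0.5, [1.0, 0.0], draw_x, rng)
pts3 = limit_points(20, 3, -0.5, [1.0, 0.0], draw_x, rng)
print(pts3[:3])
```

In the homogeneous case the $J_i$ do not depend on the regressors and are increasing in $i$; type 2 points are negative, type 3 points positive.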



The following theorem establishes the weak limit of $\hat Z_T(k)$'s as a function of
$N$.

Theorem 4.1 (Extreme order regression quantiles). Assume Conditions
R1 and R2 and that $\{Y_t, X_t\}$ is an i.i.d. or a stationary sequence satisfying
the Meyer type conditions of Lemma 9.4. Then as $\tau T \to k > 0$ and $T \to \infty$,

(4.9) $\hat Z_T(k) \to_d Z_\infty(k) \equiv \arg\min_{z \in Z} \left[ -k\mu_X'z + \int (x'z - u)^{+}\, dN(u, x) \right]$

provided $Z_\infty(k)$ is a uniquely defined random vector in $Z$, where
$(x'z - u)^{+} \equiv 1(u < x'z)(x'z - u)$, $Z = \mathbb{R}^d$ for type 1 and 3 tails, and
$Z = \{z \in \mathbb{R}^d : \max_{x \in \mathbf{X}} z'x < 0\}$ for type 2 tails. Moreover,

(4.10) $\hat Z_T^c(k) \to_d Z_\infty^c(k) \equiv Z_\infty(k) - \eta(k)$,

where

(4.11) $\eta(k) \equiv \begin{cases} c + \ln k \cdot e_1, & \text{for type 1 tails,} \\ -k^{-\xi} c, & \text{for type 2 tails,} \\ k^{-\xi} c, & \text{for type 3 tails.} \end{cases}$

If $Z_\infty(k)$ is a uniquely defined random vector for $k = k_1, \ldots, k_l$,

$(\hat Z_T(k_1)', \ldots, \hat Z_T(k_l)')' \to_d (Z_\infty(k_1)', \ldots, Z_\infty(k_l)')'$,

$(\hat Z_T^c(k_1)', \ldots, \hat Z_T^c(k_l)')' \to_d (Z_\infty^c(k_1)', \ldots, Z_\infty^c(k_l)')'$.

Remark 4.1 (The limit criterion function). The limit objective function
$-k\mu_X'z + \int (x'z - u)^{+}\, dN(u, x)$ can also be written as

(4.12) $-k\mu_X'z + \sum_{i=1}^{\infty} (X_i'z - J_i)^{+}$.

Remark 4.2 (Homogeneous case). The limit result is simpler for the 
homogeneous case (3.5), since N does not depend on the heterogeneity pa- 
rameter c due to (4.8). 

Remark 4.3 (Case with $\tau T \to 0$). The linear programming estimator,
which corresponds to $\tau T \to 0$ in (2.4) (in comparison, here $\tau T \to k > 0$), was
studied in Feigin and Resnick (1994), Smith (1994), Portnoy and Jurečková
(1999), Knight (1999, 2001) and Chernozhukov (1998) under various types
of location-shift specification (3.6). This estimator is the solution to the
problem

(4.13) $\max_{\beta} \bar X'\beta$ such that $Y_t \ge X_t'\beta$ for all $t \le T$, $\bar X = T^{-1} \sum_{t=1}^{T} X_t$.

The asymptotics of (4.13) and proofs differ substantively from the ones given
here for $\tau T \to k > 0$. The analysis of $\tau T \to k > 0$ is specifically motivated by
the applications listed in the Introduction.
Remark 4.4 (Uniqueness). The limit objective function is convex, and
it is assumed in Theorem 4.1 that $Z_\infty(k)$ is unique and tight. Lemma 9.7
shows that a sufficient condition for tightness is the design condition of
Portnoy and Jurečková (1999). Taking tightness as given, conditions for
uniqueness can be established. Define $\mathcal{H}$ as the set of all $d$-element subsets
of $\mathbb{N}$. For $h \in \mathcal{H}$, let $X(h)$ and $J(h)$ be the matrix with rows $X_t'$, $t \in h$, and
vector with elements $J_t$, $t \in h$, respectively. Let $\mathcal{H}^* = \{h \in \mathcal{H} : |X(h)| \ne 0\}$.
$\mathcal{H}^*$ is nonempty a.s. by Condition R2 and is countable. Application of the argument
of Theorem 3.1 of Koenker and Bassett (1978) gives that an argmin
of (4.12) takes the form $z_h = X(h)^{-1} J(h)$ for some $h \in \mathcal{H}^*$, and must satisfy
the gradient condition

(4.14) $\zeta_k(z_h) \equiv \left( k\mu_X - \sum_{t=1}^{\infty} 1(J_t < X_t'z_h) X_t \right)' X(h)^{-1} \in [0, 1]^d$,

where the argmin is unique iff $\zeta_k(z_h) \in \mathcal{D} = (0, 1)^d$. Thus, uniqueness holds
for a fixed $k > 0$ if

(4.15) $P(\zeta_k(z_h) \in \partial\mathcal{D} \text{ for some } h \in \mathcal{H}^*) = 0$.

This condition is a direct analog of Koenker and Bassett's (1978) condition
for uniqueness in finite samples; for instance, it is satisfied for a given $k$
when covariates $X_{-1t}$ are absolutely continuous [cf. Portnoy (1991b)]. Thus,
uniqueness holds generically in the sense that for a fixed $k$ adding arbitrarily
small absolutely continuous perturbations to the covariates $X_{-1t}$ ensures (4.15).

Remark 4.5 (Asymptotic density). The density of $Z_\infty(k)$ can be stated
following Koenker and Bassett (1978). Given $\{X_t\}$, $h \in \mathcal{H}^*$, and $J(h)$, the
probability that $Z_\infty(k) = X(h)^{-1} J(h)$ equals $P\{\zeta_k(X(h)^{-1} J(h)) \in \mathcal{D} \mid \{X_t\}, J(h)\}$.
Conditional on $\{Z_\infty(k) = X(h)^{-1} J(h)\}$, $h \in \mathcal{H}^*$, and $X(h)$, the density of
$Z_\infty(k)$ at $z$ is $f_{J(h)|X(h)}(X(h)z) \cdot |X(h)|$, where $f_{J(h)|X(h)}(u)$, $u \in \mathbb{R}^d$, is the
joint density of $J(h)$ conditional on $X(h)$. Thus, the joint density of $Z_\infty(k)$
at $z$ is

$f_{Z_\infty(k)}(z) = E\left[ \sum_{h \in \mathcal{H}^*} f_{J(h)|X(h)}(X(h)z) \cdot |X(h)| \times P\{\zeta_k(X(h)^{-1} J(h)) \in \mathcal{D} \mid \{X_t\}, J(h)\} \right]$.

Finally, for $f_{Z_\infty(k)}(z)$ to be nondefective, $Z_\infty(k) = O_p(1)$ should be established
(cf. Lemma 9.7).



Remark 4.6 (Univariate case). The density simplifies in the classical
nonregression case, that is, when $X = 1$, in which case we also have the
simplification (4.8). In this case, an argmin is necessarily an order statistic,
that is, $z_h = J(h) = J_h$; the gradient condition (4.14) becomes

(4.16) $\zeta_k(z_h) = \left( k - \sum_{t=1}^{\infty} 1(J_t < z_h) \right) \in [0, 1]$;

and the condition for uniqueness is that $\zeta_k(z_h) \in \mathcal{D} = (0, 1)$. Then, for $k \ne \lceil k \rceil$,
$P\{\zeta_k(z_h) \in \mathcal{D}\} = 1$ if $h = \lceil k \rceil$ and $P\{\zeta_k(z_h) \in \mathcal{D}\} = 0$ if $h \ne \lceil k \rceil$. Here $k \ne \lceil k \rceil$
is needed for uniqueness. Hence, $f_{Z_\infty(k)}(z) = f_{J_{\lceil k \rceil}}(z)$, which is the limit
density of the $\lceil k \rceil$th order statistics in the univariate case. Thus, uniqueness
holds for almost every $k \in (0, \infty)$.
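
The univariate claim can be checked by simulation: with $X = 1$ the limit objective is $-kz + \sum_i (z - J_i)^{+}$, and for non-integer $k$ its minimizer should be the $\lceil k \rceil$th smallest point. The following is a minimal sketch under stated assumptions: it uses type 3 points $J_i = \Gamma_i^{-\xi}$ with a hypothetical $\xi = -0.5$, truncates the process at 50 points, and searches only over the points themselves (the objective is convex and piecewise linear, so a minimizer sits at a kink).

```python
import math
import random


def type3_points(n, xi, rng):
    """First n points J_i = Gamma_i**(-xi) of the univariate limit process."""
    gamma, points = 0.0, []
    for _ in range(n):
        gamma += rng.expovariate(1.0)
        points.append(gamma ** (-xi))
    return points


def limit_objective(z, k, points):
    """Truncated version of -k*z + sum_i (z - J_i)^+ with X = 1."""
    return -k * z + sum(max(z - j, 0.0) for j in points)


rng = random.Random(0)
J = type3_points(50, xi=-0.5, rng=rng)   # increasing in i because -xi > 0
k = 2.5                                  # non-integer, so the argmin is unique

z_star = min(J, key=lambda z: limit_objective(z, k, J))
expected = J[math.ceil(k) - 1]           # the ceil(k)-th order statistic
print(z_star, expected)
```

The slope of the objective between consecutive points $J_{(h)}$ and $J_{(h+1)}$ is $h - k$, which changes sign exactly at $h = \lceil k \rceil$; truncating the sum does not affect this as long as more than $\lceil k \rceil$ points are kept.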

5. Asymptotics of intermediate order regression quantiles. In order to
develop asymptotic results for the intermediate regression quantiles, the following
additional Condition R3 will be added. First, existence of the quantile
density function $\partial F_Y^{-1}(\tau \mid x)/\partial\tau = x'\,\partial\beta(\tau)/\partial\tau$ and its regular variation will
be required. Second, the tail equivalence of the conditional distribution functions,
previously assumed in Condition R1, will now be strengthened to the
tail equivalence of conditional quantile density functions.

Condition R3. In addition to Conditions R1 and R2, for $\xi$ defined in
(3.1),

(5.1)
(i) $\frac{\partial F_U^{-1}(\tau \mid x)}{\partial\tau} \sim \frac{\partial F_u^{-1}(\tau / K(x))}{\partial\tau}$ uniformly in $x \in \mathbf{X}$,
(ii) $\frac{\partial F_u^{-1}(\tau)}{\partial\tau}$ is regularly varying at 0 with exponent $-\xi - 1$.

In the homoscedastic case (3.5), Condition R3(i) amounts to
$\partial F_U^{-1}(\tau \mid x)/\partial\tau \sim \partial F_u^{-1}(\tau)/\partial\tau$ uniformly in $x \in \mathbf{X}$ as $\tau \searrow 0$. Condition R3(ii) is a von Mises type
condition; see Dekkers and de Haan (1989) for a detailed analysis of the
plausibility of Condition R3(ii).

For an intermediate sequence such that $\tau \searrow 0$ and $\tau T \to \infty$, define, for
$m > 1$,

(5.2) $\hat Z_T \equiv a_T(\hat\beta(\tau) - \beta(\tau))$, $a_T \equiv \frac{\sqrt{\tau T}}{\mu_X'(\beta(m\tau) - \beta(\tau))}$.

Consider also $k$ sequences $\{\tau l_1, \ldots, \tau l_k\}$, where $l_1, \ldots, l_k$ are positive constants,
and corresponding statistics $(\hat Z_T(l_1)', \ldots, \hat Z_T(l_k)')'$, where, for $l > 0$
and $m > 1$,

(5.3) $\hat Z_T(l) \equiv a_T(l)(\hat\beta(l\tau) - \beta(l\tau))$, $a_T(l) \equiv \frac{\sqrt{\tau l T}}{\mu_X'(\beta(ml\tau) - \beta(l\tau))}$.

The following theorem establishes the weak limits for $\hat Z_T$ and $\hat Z_T(l)$'s. Because
$\tau \searrow 0$, the limits depend only on the tail parameters $\xi$ and $c$, as in
Theorem 4.1, but since $\tau T \to \infty$, the limits are normal, unlike in Theorem 4.1.

Theorem 5.1 (Intermediate order regression quantiles). Suppose Conditions
R1-R3 hold, and that $\{Y_t, X_t\}$ is an i.i.d. sequence or a stationary
series satisfying the conditions of Lemma 9.6. Then, as $\tau T \to \infty$ and $\tau \searrow 0$,

(5.4) $\hat Z_T \to_d Z_\infty = N(0, \Omega_0)$, $\Omega_0 \equiv Q_H^{-1} Q_X Q_H^{-1} \frac{\xi^2}{(m^{-\xi} - 1)^2}$,

where, for $\xi = 0$, interpret $\xi^2/(m^{-\xi} - 1)^2$ as $(\ln m)^{-2}$ and

(5.5) $Q_H \equiv E[H(X)]^{-1} XX'$, $Q_X \equiv EXX'$,

(5.6) $H(x) \equiv x'c$ for type 2 and 3 tails, $H(x) \equiv 1$ for type 1 tails.

In addition,

(5.7) $(\hat Z_T(l_1)', \ldots, \hat Z_T(l_k)')' \to_d (Z_\infty(l_1)', \ldots, Z_\infty(l_k)')' = N(0, \Omega)$,

(5.8) $E Z_\infty(l_i) Z_\infty(l_j)' \equiv \Omega_0 \min(l_i, l_j)/\sqrt{l_i l_j}$.

Finally, $a_T(l)$ can be replaced by $\sqrt{\tau l T} / \bar X'(\hat\beta(ml\tau) - \hat\beta(l\tau))$ without affecting
(5.4) and (5.7), that is,

(5.9) $a_T(l) \Big/ \left( \frac{\sqrt{\tau l T}}{\bar X'(\hat\beta(ml\tau) - \hat\beta(l\tau))} \right) \to_p 1$, where $\bar X = T^{-1} \sum_{t=1}^{T} X_t$.

Remark 5.1 (Scaling constants). It may be useful to have the same
normalization $a_T$ in place of $a_T(l)$ for the joint convergence. This is possible
by noting that $a_T / a_T(l) \to l^{-\xi}/\sqrt{l}$.

Remark 5.2 (Homogeneous case). In the homogeneous case (3.5), $H(X) = 1$,
so the variance simplifies to

(5.10) $\Omega_0 = Q_X^{-1} \frac{\xi^2}{(m^{-\xi} - 1)^2}$.

Remark 5.3 (Nonregression case). Theorem 5.1 extends Theorem 3.1
of Dekkers and de Haan (1989), which applies to univariate quantiles, to the
case of regression quantiles. In fact, Theorem 3.1 of Dekkers and de Haan
(1989) can be specialized from Theorem 5.1 with $X = 1$ and $m = 2$. In this
case the variance becomes

(5.11) $\frac{\xi^2}{(2^{-\xi} - 1)^2} = \frac{\xi^2\, 2^{2\xi}}{(2^{\xi} - 1)^2}$,

as Dekkers and de Haan (1989) found in their Theorem 3.1.
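
The scalar factor $\xi^2/(m^{-\xi} - 1)^2$ that multiplies the design matrix in the limit variance is easy to evaluate, including its $\xi = 0$ limit $(\ln m)^{-2}$. The helper below is a small sketch; the function name and the cutoff used to switch to the limiting value are choices made here, not part of the paper.

```python
import math


def spacing_variance_factor(xi, m=2.0):
    """xi**2 / (m**(-xi) - 1)**2, read as (ln m)**(-2) when xi = 0."""
    if abs(xi) < 1e-12:
        return 1.0 / math.log(m) ** 2
    # expm1 keeps m**(-xi) - 1 accurate when xi is close to zero.
    return xi ** 2 / math.expm1(-xi * math.log(m)) ** 2


at_zero = spacing_variance_factor(0.0)
near_zero = spacing_variance_factor(1e-6)
xi = 0.5
first_form = spacing_variance_factor(xi, m=2.0)
second_form = xi ** 2 * 2.0 ** (2 * xi) / (2.0 ** xi - 1.0) ** 2
print(at_zero, near_zero, first_form, second_form)
```

The two algebraic forms of the $m = 2$ variance agree, and the factor is continuous at $\xi = 0$.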
6. Quantile regression spacings and tail inference. The tail parameters
enter the limit distributions in Theorems 4.1 and 5.1, and estimation of the 
tail index is an important problem of its own. The following results show 
how to estimate them by applying Pickands (1975) type procedures to the 
quantile regression spacings. 

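Consider the following parameters and statistics:

(6.1) $\rho_{x,\dot x,l} \equiv \frac{x'(\beta(ml\tau) - \beta(l\tau))}{\dot x'(\beta(m\tau) - \beta(\tau))}$, $\hat\rho_{x,\dot x,l} \equiv \frac{x'(\hat\beta(ml\tau) - \hat\beta(l\tau))}{\dot x'(\hat\beta(m\tau) - \hat\beta(\tau))}$.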



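Theorem 6.1 shows that the quantile regression spacings of intermediate
order consistently approximate the corresponding spacings in the population
[results (i) and (ii)], which then reveal the tail parameters [results (iii) and
(iv)].

Theorem 6.1 (Quantile regression spacings and tail inference). Suppose
the conditions of Theorem 5.1 hold. Then as $\tau \searrow 0$, $\tau T \to \infty$, for all $l > 0$,
$m > 1$, $x, \dot x \in \mathbf{X}$,

(i) $\frac{x'(\hat\beta(m\tau) - \hat\beta(\tau))}{x'(\beta(m\tau) - \beta(\tau))} \to_p 1$,

(ii) $\hat\rho_{x,\dot x,l} - \rho_{x,\dot x,l} \to_p 0$, $\rho_{x,\dot x,l} \to l^{-\xi} \cdot [H(x)/H(\dot x)]$, for $H(x)$ defined in
Theorem 5.1,

(iii) $\hat\xi_{rp} \equiv \frac{-1}{\ln l} \ln \hat\rho_{\bar X, \bar X, l} \to_p \xi$,

(iv) $\hat\rho_{x, \bar X, 1} \to_p x'c$ uniformly in $x \in \mathbf{X}$ ($\xi \ne 0$),

(v) for $\kappa \equiv \mu_X' Q_H^{-1} Q_X Q_H^{-1} \mu_X$, $l = m = 2$, if $\sqrt{\tau T}(\rho_{\bar X, \bar X, l} - \lim_T \rho_{\bar X, \bar X, l}) \to 0$,

(6.2) $\sqrt{\tau T}(\hat\xi_{rp} - \xi) \to_d N\left(0, \kappa \frac{\xi^2 (2^{2\xi+1} + 1)}{(2(2^{\xi} - 1) \ln 2)^2}\right)$.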

Remark 6.1 (Homogeneous case). The proposed estimator $\hat\xi_{rp}$ consistently
estimates the tail index $\xi$ in the heteroscedastic and homoscedastic
quantile regression models, and it is a regression extension of the Pickands
(1975) estimator. In fact, in the homoscedastic model (3.5) or when $X = 1$,
$\kappa = \mu_X'(EXX')^{-1}\mu_X = e_1'(EXX')^{-1}e_1 = 1$, so the variance in (6.2) reduces
to that of the canonical Pickands estimator.
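
The mechanics of the Pickands-type estimator are easiest to see in the univariate case $X = 1$ with a known quantile function standing in for the estimated regression quantiles. The sketch below is an illustration under those assumptions; `pickands_type_index` and the two example quantile functions are hypothetical names introduced here, and no sampling error is involved.

```python
import math


def pickands_type_index(q, tau, m=2.0, l=2.0):
    """-(1/ln l) * ln(rho), with rho the ratio of quantile spacings.

    q is a quantile function standing in for x'beta(.) when X = 1.
    """
    rho = (q(m * l * tau) - q(l * tau)) / (q(m * tau) - q(tau))
    return -math.log(rho) / math.log(l)


def power_quantile(tau, xi=-0.5):
    """Exact type 3 tail: F^{-1}(tau) = tau**(-xi)."""
    return tau ** (-xi)


def cauchy_quantile(tau):
    """Standard Cauchy quantile function; lower tail of type 2 with xi = 1."""
    return math.tan(math.pi * (tau - 0.5))


xi_power = pickands_type_index(power_quantile, tau=1e-3)
xi_cauchy = pickands_type_index(cauchy_quantile, tau=1e-4)
print(xi_power, xi_cauchy)
```

For the power-law quantile function the spacing ratio equals $l^{-\xi}$ exactly, so the index is recovered up to rounding; for the Cauchy case the recovery is only approximate and improves as $\tau \searrow 0$.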
7. An illustrative example. The set of results established here may provide
reliable and practical inference for extremal regression quantiles. To
illustrate this possibility, the following simple example compares graphically
the conventional central (asymptotically normal) approximation (7.1), which
applies for fixed $\tau \in (0, 1)$ as $T \to \infty$,
to the extreme approximation (cf. Theorem 4.1). The comparison is based on
the following design: $\tau = 0.025$, $Y_t = X_t'\beta + U_t$, $U_t \sim$ Cauchy, $t = 1, \ldots, 500$,
where $X_t = (1, X_{-1t}')' \in \mathbb{R}^5$, $X_{-1t}$ are i.i.d. Beta(3, 3) variables, and $\beta = (1, 1, 1, 1, 1)$.
[A more detailed simulation study is given in Chernozhukov (1999).] In this
comparison, the parameters of the limit distribution are fixed at the true
values.
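
One draw from the simulation design just described can be generated with the standard library alone. This is a sketch, not the code behind Figure 1: it assumes a standard Cauchy error (the text does not state location or scale), draws it by inversion, and the function name and seed are arbitrary.

```python
import math
import random


def simulate_design(T=500, d=5, seed=0):
    """One sample from Y_t = X_t'beta + U_t with beta = (1, ..., 1)."""
    rng = random.Random(seed)
    beta = [1.0] * d
    X, Y = [], []
    for _ in range(T):
        # Intercept plus d - 1 independent Beta(3, 3) regressors.
        x = [1.0] + [rng.betavariate(3, 3) for _ in range(d - 1)]
        # Standard Cauchy by inversion (assumed: location 0, scale 1).
        u = math.tan(math.pi * (rng.random() - 0.5))
        X.append(x)
        Y.append(sum(b * xi for b, xi in zip(beta, x)) + u)
    return X, Y


tau = 0.025
X, Y = simulate_design()
k = tau * len(Y)   # effective order of the quantile, tau * T
print(len(Y), len(X[0]), k)
```

Here $\tau T = 12.5$ is the value of $k$ entering the extreme approximation; estimating $\hat\beta(\tau)$ itself would additionally require a linear programming solver, which is left out of this sketch.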

Figure 1 plots (a) quantiles of the simulated finite-sample distribution
of $\hat\beta_1(0.025)$ and $\hat\beta_2(0.025)$, (b) quantiles of the simulated extreme approximation
(cf. Theorem 4.1), (c) quantiles of the central approximation [cf.
(7.1)]. Here $\tau \times T = 0.025 \times 500 = 12.5$. It can be seen that the extreme
approximation accurately captures the actual sampling distribution of both
the intercept estimator $\hat\beta_1(0.025)$ and the slope estimator $\hat\beta_2(0.025)$. In contrast,
the central approximation (7.1) does not capture asymmetry and thick
tails of the true finite sample distribution. The intermediate approximation
(cf. Theorem 5.1) performs similarly to the central approximation and is
not plotted. The central and intermediate approximations are expected to
perform better for less extreme quantiles.

8. Conclusion. The paper obtains the large sample properties of extreme 
order and intermediate order quantile regression for the class of linear quan- 
tile regression models with tails of the response variable restricted to the 
domain of minimum attraction and closed under tail equivalence across con- 
ditioning values. There are several interesting directions for future work. It 
would be important to determine the most practical and reliable inference 
procedures that can be based on the obtained limit distributions. Also, it 
would be interesting to examine estimation of the extreme conditional quan- 
tiles defined through an extrapolation of the intermediate regression quan- 
tiles. The nonregression case has been considered in Dekkers and de Haan 
(1989) and de Haan and Rootzén (1993), and the approach may prove useful
in the quantile regression case. Another interesting direction would be an
investigation of the Hill and other tail index estimators based on regression 
quantiles. 




9. Proofs. 
Fig. 1. Panel A plots quantiles of the finite-sample distribution of $\hat\beta_1(\tau)$ (horizontal axis)
against the quantiles of the extreme approximation (cf. Theorem 4.1) and the quantiles of
the central approximation (7.1) (vertical axis). Panel B plots quantiles of the finite-sample
distribution of $\hat\beta_2(\tau)$ (horizontal axis) against the quantiles of the extreme approximation
(cf. Theorem 4.1) and the quantiles of the central approximation (7.1) (vertical axis). The
plot is based on 10,000 simulations of the regression model described in Section 7. The
dashed line denotes quantiles of the central approximation, and the dotted
line denotes quantiles of the extreme approximation (this approximation almost
coincides with the solid line). The simulated quantiles of the finite-sample distribution are given
by the 45-degree line depicted as the solid line.



9.1. Properties of the linear quantile regression model under Conditions R1 and R2. Let

(9.1) $\mathcal{M}$ = any fixed compact sub-interval of $(0,1) \cup (1,\infty)$,

(9.2) $\mathcal{M}'$ = any other fixed compact sub-interval of $(0,1) \cup (1,\infty)$,

(9.3) $\mathcal{T}(\tau') = \{\tau : \tau = s\tau',\ s \in \mathcal{L}\}$ where $\tau' \searrow 0$,

(9.4) $\mathcal{L}$ = any fixed compact sub-interval of $(0,\infty)$.

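Lemma 9.1 (Properties of the linear model under Conditions R1 and R2). Conditions R1 and R2 imply that (for a constant vector $c$ specified in Theorem 3.1):

(i) $K(x)$ can be represented by the forms specified in Theorem 3.1.

(ii) $a_T(\beta(\tau) - \beta_r - b_T e_1) \to \eta(k)$ for $\eta(k)$ defined in Theorem 4.1.

(iii) Uniformly in $(m,\tau,x) \in \mathcal{M} \times \mathcal{T}(\tau') \times \mathbf{X}$, as $\tau' \searrow 0$,

(9.5) $\dfrac{\beta_{-1}(\tau) - \beta_{-1r}}{F_u^{-1}(m\tau) - F_u^{-1}(\tau)} \to \mu(m) \equiv \begin{cases} \dfrac{c_{-1}}{m^{-\xi} - 1}, & \text{for } \xi < 0, \\ \dfrac{-c_{-1}}{1 - m^{-\xi}}, & \text{for } \xi > 0, \\ \dfrac{c_{-1}}{\ln m}, & \text{for } \xi = 0; \end{cases}$

also $\beta_1(\tau) - \beta_{1r} = F_u^{-1}(\tau)$, and $(\beta_{-1}(\tau) - \beta_{-1r})/F_u^{-1}(\tau) \to c_{-1}$ for $\xi \neq 0$.

(iv) Uniformly in $(m,\tau,x) \in \mathcal{M} \times \mathcal{T}(\tau') \times \mathbf{X}$, as $\tau' \searrow 0$,

(9.6) $\dfrac{(x - \mu_X)'(\beta(\tau) - \beta_r)}{\mu_X'(\beta(m\tau) - \beta(\tau))} \to \begin{cases} \dfrac{(x - \mu_X)'c}{m^{-\xi} - 1}, & \text{if } \xi < 0, \\ \dfrac{-(x - \mu_X)'c}{1 - m^{-\xi}}, & \text{if } \xi > 0, \\ \dfrac{(x - \mu_X)'c}{\ln m}, & \text{if } \xi = 0. \end{cases}$

(v) Uniformly in $(m,\tau,x) \in \mathcal{M} \times \mathcal{T}(\tau') \times \mathbf{X}$, as $\tau' \searrow 0$,

(9.7) $\dfrac{x'(\beta(m\tau) - \beta(\tau))}{\mu_X'(\beta(m\tau) - \beta(\tau))} \to \begin{cases} x'c, & \text{if } \xi < 0, \\ x'c, & \text{if } \xi > 0, \\ 1, & \text{if } \xi = 0. \end{cases}$

(vi) Uniformly in $(l,m,\tau,x) \in \mathcal{M} \times \mathcal{M}' \times \mathcal{T}(\tau') \times \mathbf{X}$, as $\tau' \searrow 0$,

(9.8) $\dfrac{x'(\beta(l\tau) - \beta(\tau))}{x'(\beta(m\tau) - \beta(\tau))} \to \begin{cases} \dfrac{l^{-\xi} - 1}{m^{-\xi} - 1}, & \text{if } \xi < 0, \\ \dfrac{1 - l^{-\xi}}{1 - m^{-\xi}}, & \text{if } \xi > 0, \\ \dfrac{\ln l}{\ln m}, & \text{if } \xi = 0. \end{cases}$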

Write $F_u \in D(H_\xi)$ if $F_u$ is a c.d.f. in the domain of minimum attraction with tail index $\xi$. Write $F_u \in \mathcal{R}_\gamma(0)$ if $F_u$ is a regularly varying function at 0 with exponent $\gamma$.

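Lemma 9.2 (Useful relations). Under Conditions R1 and R2, uniformly in $(m,l,\tau) \in \mathcal{M} \times \mathcal{M}' \times \mathcal{T}(\tau')$, as $\tau' \searrow 0$:

(i) Suppose $F_1(z) \sim F_2(z)$ as $z \searrow 0$ or $-\infty$ and $F_1 \in D(H_\xi)$. Then $F_2 \in D(H_\xi)$; $F_1^{-1}$ and $F_2^{-1} \in \mathcal{R}_{-\xi}(0)$; $F_1(F_1^{-1}(\tau)) \sim \tau$ and $F_2(F_2^{-1}(\tau)) \sim \tau$; and

(9.9) $(F_1^{-1}(m\tau) - F_1^{-1}(\tau)) \sim (F_2^{-1}(m\tau) - F_2^{-1}(\tau)).$

(ii) If $F_U(z|x) \sim K(x)F_u(z)$ as $z \searrow 0$ or $-\infty$ for each $x \in \mathbf{X}$ (compact), where $K(x) \in (0,\infty)$ for all $x \in \mathbf{X}$, then for each $x \in \mathbf{X}$,

(9.10) $F_U^{-1}(m\tau|x) - F_U^{-1}(\tau|x) \sim F_u^{-1}(m\tau/K(x)) - F_u^{-1}(\tau/K(x)).$

(iii) $\dfrac{F_u^{-1}(l\tau) - F_u^{-1}(\tau)}{F_u^{-1}(m\tau) - F_u^{-1}(\tau)} \to \dfrac{l^{-\xi} - 1}{m^{-\xi} - 1}$ if $\xi < 0$, $\dfrac{1 - l^{-\xi}}{1 - m^{-\xi}}$ if $\xi > 0$, $\dfrac{\ln l}{\ln m}$ if $\xi = 0$; for $F_u \in D(H_\xi)$.

(iv) $\dfrac{F_u^{-1}(m\tau) - F_u^{-1}(\tau)}{a(F_u^{-1}(\tau))} \to \ln m$ if $F_u \in D(H_0)$, where $a(\cdot)$ is the auxiliary function defined in (3.1).

Proof. Results (i), (iii) and (iv) are well known [cf. de Haan (1984) and Resnick (1987), Chapters 1 and 2]. Result (ii) holds from (i) pointwise in $x$. □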
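
As a numerical illustration of Lemma 9.2(iii) (a sketch, not part of the proof), the following Python snippet evaluates the quantile-spacing ratio at a small $\tau$ for two lower tails chosen here purely as examples: a standard Cauchy tail (tail index $\xi = 1$ in the convention of this section) and a standard logistic tail ($\xi = 0$). The function names, the values $l = 2$, $m = 3$ and $\tau = 10^{-6}$ are assumptions made for the illustration.

```python
import math

def cauchy_quantile(tau):
    # Lower-tail quantile of the standard Cauchy law:
    # tan(pi * (tau - 1/2)) = -1 / tan(pi * tau), which is numerically stabler.
    return -1.0 / math.tan(math.pi * tau)

def logistic_quantile(tau):
    # Quantile function of the standard logistic law.
    return math.log(tau / (1.0 - tau))

def spacing_ratio(q, l, m, tau):
    # (q(l*tau) - q(tau)) / (q(m*tau) - q(tau)), the ratio in Lemma 9.2(iii).
    return (q(l * tau) - q(tau)) / (q(m * tau) - q(tau))

def limit_ratio(xi, l, m):
    # Limit of the spacing ratio: ln(l)/ln(m) for xi = 0, and
    # (l^(-xi) - 1) / (m^(-xi) - 1) otherwise.
    if xi == 0.0:
        return math.log(l) / math.log(m)
    return (l ** (-xi) - 1.0) / (m ** (-xi) - 1.0)

l, m, tau = 2.0, 3.0, 1e-6
print(spacing_ratio(cauchy_quantile, l, m, tau), limit_ratio(1.0, l, m))
print(spacing_ratio(logistic_quantile, l, m, tau), limit_ratio(0.0, l, m))
```

In both cases the finite-$\tau$ ratio should be close to the stated limit, and shrinking `tau` further should tighten the agreement.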

Proof of Lemma 9.1. Claim (i): The proof consists of two steps, where we use notation $(\mathcal{L}, \mathcal{M}, \mathcal{T}(\tau'), \tau')$ as defined in (9.1)-(9.4).

Step 1. In this step all of the results hold uniformly in $(m,\tau,x) \in \mathcal{M} \times \mathcal{T}(\tau') \times \mathbf{X}$ as $\tau' \searrow 0$, but we shall suppress this qualification for notational simplicity. By construction in Condition R1, $x'(\beta(\tau) - \beta_r) = F_U^{-1}(\tau|x)$ and $\mu_X'(\beta(\tau) - \beta_r) = F_U^{-1}(\tau|\mu_X) = F_u^{-1}(\tau)$. Hence,

(9.11) $B_\tau(x,m) \equiv \dfrac{(x - \mu_X)'(\beta(\tau) - \beta_r)}{\mu_X'(\beta(m\tau) - \beta(\tau))} = \dfrac{F_U^{-1}(\tau|x) - F_u^{-1}(\tau)}{F_u^{-1}(m\tau) - F_u^{-1}(\tau)}.$

We would like to show that, for each $x \in \mathbf{X}$,

(9.12) $B_\tau(x,m) \to B(x,m) \equiv \begin{cases} \dfrac{(1/K(x))^{-\xi} - 1}{m^{-\xi} - 1}, & \text{if } \xi < 0, \\ \dfrac{1 - (1/K(x))^{-\xi}}{1 - m^{-\xi}}, & \text{if } \xi > 0, \\ -\dfrac{\ln K(x)}{\ln m}, & \text{if } \xi = 0. \end{cases}$

We will show (9.12) for the case $\xi < 0$ only; others follow similarly. Fix any $x \in \mathbf{X}$. By Condition R1 and Lemma 9.2(i), $F_U(F_U^{-1}(\tau|x)|x) \sim \tau$. Hence, by Condition R1, $K(x) \cdot F_u(F_U^{-1}(\tau|x)) \sim \tau$ as $\tau' \searrow 0$. Therefore, there exist sequences of constants $K_\tau(x)$ and $K'_\tau(x)$ such that

(9.13) $F_u^{-1}(\tau/K_\tau(x)) \le F_U^{-1}(\tau|x) \le F_u^{-1}(\tau/K'_\tau(x)),$ where $K_\tau(x) \to K(x)$ and $K'_\tau(x) \to K(x)$.

Therefore,

(9.14) $\dfrac{F_u^{-1}(\tau/K_\tau(x)) - F_u^{-1}(\tau)}{F_u^{-1}(m\tau) - F_u^{-1}(\tau)} \le B_\tau(x,m) \le \dfrac{F_u^{-1}(\tau/K'_\tau(x)) - F_u^{-1}(\tau)}{F_u^{-1}(m\tau) - F_u^{-1}(\tau)}.$

Suppose that $K(x) \neq 1$. By Lemma 9.2(iii),

(9.15) $\dfrac{F_u^{-1}(\tau/K_\tau(x)) - F_u^{-1}(\tau)}{F_u^{-1}(m\tau) - F_u^{-1}(\tau)} \to \dfrac{(1/K(x))^{-\xi} - 1}{m^{-\xi} - 1} = B(x,m),$

and, likewise, conclude for $K'_\tau(x)$ in place of $K_\tau(x)$. Therefore, $B_\tau(x,m) \to B(x,m)$ when $K(x) \neq 1$. To show that $B_\tau(x,m) \to B(x,m)$ also holds for $K(x) = 1$ with $B(x,m) = 0$, let $k'$ and $k''$ be any positive constants such that $k' < 1 < k''$. By monotonicity of the quantile function, for all sufficiently small $\tau'$,

(9.16) $\dfrac{F_u^{-1}(\tau/k'') - F_u^{-1}(\tau)}{F_u^{-1}(m\tau) - F_u^{-1}(\tau)} \le \dfrac{F_u^{-1}(\tau/K_\tau(x)) - F_u^{-1}(\tau)}{F_u^{-1}(m\tau) - F_u^{-1}(\tau)} \le \dfrac{F_u^{-1}(\tau/k') - F_u^{-1}(\tau)}{F_u^{-1}(m\tau) - F_u^{-1}(\tau)}.$

By Lemma 9.2(iii), as $\tau' \searrow 0$, the upper and lower bounds in (9.16) converge, respectively, to

(9.17) $\dfrac{(1/k')^{-\xi} - 1}{m^{-\xi} - 1}$ and $\dfrac{(1/k'')^{-\xi} - 1}{m^{-\xi} - 1}.$

If in (9.17) we let $k', k'' \to 1$, then the expressions in (9.17) $\to 0$. Therefore, since $k'$ and $k''$ can be chosen arbitrarily close to 1, it follows from (9.16) and (9.17) that $\dfrac{F_u^{-1}(\tau/K_\tau(x)) - F_u^{-1}(\tau)}{F_u^{-1}(m\tau) - F_u^{-1}(\tau)} \to 0$ as $\tau' \searrow 0$. Likewise, conclude for $K'_\tau(x)$ in place of $K_\tau(x)$. Therefore, $B_\tau(x,m) \to B(x,m) = 0$ when $K(x) = 1$.

Step 2. By Step 1, for each $x \in \mathbf{X}$, uniformly in $(m,\tau) \in \mathcal{M} \times \mathcal{T}(\tau')$ as $\tau' \searrow 0$,

(9.18) $B_\tau(x,m) \equiv \dfrac{(x - \mu_X)'(\beta(\tau) - \beta_r)}{\mu_X'(\beta(m\tau) - \beta(\tau))} \to B(x,m).$

Since (a) $B(x,m)$ is finite and continuous in $x$ over $\mathbf{X}$ by conditions imposed on $K(x)$ in Condition R1, and (b) $B_\tau(x,m)$ is linear in $x$, the relation (9.18) also holds uniformly in $x \in \mathbf{X}$. Recall that $(x - \mu_X)_1 = 0$. Since $(x - \mu_X)_{-1}$ ranges over a nondegenerate subset of $\mathbb{R}^{d-1}$, (9.18) implies

(9.19) $\dfrac{\beta_{-1}(\tau) - \beta_{-1r}}{F_u^{-1}(m\tau) - F_u^{-1}(\tau)} \to \mu(m)$

uniformly in $(m,\tau) \in \mathcal{M} \times \mathcal{T}(\tau')$ as $\tau' \searrow 0$, where $\mu(m)$ is some vector of finite constants. Hence, $B(x,m)$ is affine in $(x - \mu_X)$. Note also that $(x - \mu_X)'c = x'c - c_1$. Therefore, if $\xi = 0$, $B(x,m)$ affine and $B(x,m) = -\ln K(x)/\ln m$ imply $K(x) = e^{-(x - \mu_X)'c} = e^{-x'c + c_1} = e^{-x'c}$ for all $x$ iff $c_1 = 0$. When $\xi < 0$, $B(x,m)$ affine and $B(x,m) = (K(x)^{\xi} - 1)/(m^{-\xi} - 1)$ imply $K(x) = (1 + (x - \mu_X)'c)^{1/\xi}$, which equals $(x'c)^{1/\xi}$ for all $x$ iff $c_1 = 1$. Likewise, conclude for $\xi > 0$. This completes the proof of claim (i).

Claim (iii) follows directly from (9.19) and the preceding paragraph.

Claim (iv) is verified by substituting the forms of $K(x)$ found above into (9.18).

Claim (v) holds pointwise in $x$ by Lemma 9.2(ii) and (iii). Since the left-hand side in (9.7) is linear in $x$ and $\mathbf{X}$ is compact, it also holds uniformly in $x \in \mathbf{X}$.

A combination of Lemma 9.2(iii) with claim (v) implies claim (vi).
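Claim (ii). If $\xi < 0$, by claim (iii) uniformly in $k$ in any compact subset of $(0,\infty)$ as $T \to \infty$,

(9.20) $a_T\big(\beta(k/T) - \beta_r\big) \sim a_T\, c\, F_u^{-1}(k/T) = c\, F_u^{-1}(k/T) \big/ F_u^{-1}(1/T) \to k^{-\xi} c,$

since by Lemma 9.2(i) $F_u^{-1} \in \mathcal{R}_{-\xi}(0)$; similarly, if $\xi > 0$,

(9.21) $a_T\big(\beta(k/T) - \beta_r\big) \sim a_T\, c\, F_u^{-1}(k/T) = -c\, F_u^{-1}(k/T) \big/ F_u^{-1}(1/T) \to -k^{-\xi} c.$

If $\xi = 0$, by $c_1 = 0$, Lemma 9.2(i), (iv) and claim (iii) [using $m = e$ in $\mu(m)$], we have that uniformly in $k$ in any compact subset of $(0,\infty)$,

(9.22) $a_T\big(\beta(k/T) - \beta_r - b_T e_1\big) \sim \dfrac{1}{a(F_u^{-1}(1/T))} \Big[ c\,\big(F_u^{-1}(ek/T) - F_u^{-1}(k/T)\big) + e_1\big(F_u^{-1}(k/T) - F_u^{-1}(1/T)\big) \Big] \to c \ln e + e_1 \ln k = c + e_1 \ln k.$ □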

9.2. Proof of Theorem 3.1. Follows from Lemma 9.1 (i) . □ 

9.3. Proof of Theorem 4.1. 

Part 1. Referring to (2.4), notice that $Z_T(k)$ defined in (4.1) solves

(9.23) $Z_T(k) \in \arg\min_{z \in \mathbb{R}^d} \dfrac{1}{a_T} \sum_{t=1}^{T} \rho_\tau\big(a_T(U_t - b_T) - X_t'z\big)$

[where $z = a_T(\beta - \beta_r - b_T e_1)$]. Rearranging terms, the objective function becomes

(9.24) $\dfrac{1}{a_T} \Big[ -\tau T \bar{X}'z - \sum_{t=1}^{T} 1\big(a_T(U_t - b_T) \le X_t'z\big)\big(a_T(U_t - b_T) - X_t'z\big) + \sum_{t=1}^{T} \tau a_T(U_t - b_T) \Big].$

Multiply (9.24) by $a_T$ and subtract

(9.25) $\sum_{t=1}^{T} 1\big(a_T(U_t - b_T) \le -\delta\big)\big(-\delta - a_T(U_t - b_T)\big) + \sum_{t=1}^{T} \tau a_T(U_t - b_T)$ for some $\delta > 0$,

which does not affect optimization, and denote the new objective function $Q_T(z,k)$:

(9.26) $Q_T(z,k) = -\tau T \bar{X}'z + \sum_{t=1}^{T} l_\delta\big(a_T(U_t - b_T), X_t'z\big),$

where

(9.27) $l_\delta(u,v) = 1(u \le v)(v - u) - 1(u \le -\delta)(-\delta - u)$ for $\delta > 0$.

Since it is a sum of convex functions in $z$, $Q_T(z,k)$ is convex in $z$. The transformations make (as shown later) $Q_T$ a continuous functional of the point process $\hat{N}$:

(9.28) $Q_T(z,k) = -\tau T \bar{X}'z + \int_E l_\delta(j, x'z)\, d\hat{N}(j,x),$

where the point process

(9.29) $\hat{N}(\cdot) \equiv \sum_{t \le T} 1\{(a_T(U_t - b_T), X_t) \in \cdot\}$

is taken to be a random element of the metric space $M_p(E)$ of point processes defined on the measure space $(E, \mathcal{E})$ and equipped with the metric induced by the topology of vague convergence [cf. Resnick (1987)].
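
The rearrangement leading from (9.23) to (9.26) can be checked numerically. The sketch below is only an illustration under stated assumptions: the list `a` stands for the values $a_T(U_t - b_T)$, the list `xz` stands for the values $X_t'z$, the check function is taken as $\rho_\tau(u) = (\tau - 1(u \le 0))u$, and all function names are hypothetical. It verifies that the rescaled original objective equals the recentered objective plus the $z$-free terms removed in (9.25).

```python
import random

def rho(tau, u):
    # Check function rho_tau(u) = (tau - 1(u <= 0)) * u.
    return (tau - (1.0 if u <= 0 else 0.0)) * u

def l_delta(u, v, delta):
    # l_delta(u, v) = 1(u <= v)(v - u) - 1(u <= -delta)(-delta - u), cf. (9.27).
    first = (v - u) if u <= v else 0.0
    second = (-delta - u) if u <= -delta else 0.0
    return first - second

def original_objective(tau, a, xz):
    # Sum of rho_tau(a_t - xz_t): the objective in (9.23) multiplied by a_T.
    return sum(rho(tau, a_t - s_t) for a_t, s_t in zip(a, xz))

def recentered_objective(tau, a, xz, delta):
    # -tau * sum_t X_t'z + sum_t l_delta(a_t, X_t'z), cf. (9.26).
    return -tau * sum(xz) + sum(l_delta(a_t, s_t, delta) for a_t, s_t in zip(a, xz))

def z_free_shift(tau, a, delta):
    # The terms subtracted in (9.25); none of them involves z.
    return sum((-delta - a_t) if a_t <= -delta else 0.0 for a_t in a) + tau * sum(a)

rng = random.Random(0)
a = [rng.gauss(0.0, 2.0) for _ in range(50)]
xz = [rng.gauss(0.0, 1.0) for _ in range(50)]
gap = (original_objective(0.1, a, xz)
       - recentered_objective(0.1, a, xz, 0.5)
       - z_free_shift(0.1, a, 0.5))
print(abs(gap) < 1e-9)
```

Because the shift does not depend on `xz`, minimizing either objective over $z$ gives the same argmin, which is the point of the recentering.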

It will suffice to restrict our attention to underlying measure spaces $(E, \mathcal{E})$ of the form

(9.30) $E = \begin{cases} E_1 = [-\infty, \infty) \times \mathbf{X}, & \text{for type 1 tails}, \\ E_2 = [-\infty, 0) \times \mathbf{X}, & \text{for type 2 tails}, \\ E_3 = [0, \infty) \times \mathbf{X}, & \text{for type 3 tails}, \end{cases}$

with $\sigma$-algebra $\mathcal{E}$ generated by the open sets of $E$. The topology on $E_1$, $E_2$ and $E_3$ is assumed to be standard so that, for example, $[-\infty, a] \times \mathbf{X}$ is compact in $E_2$ for $a < 0$ and in $E_1$ for any $a < \infty$.

Part 2 shows that, for type 1 and 3 tails, the marginal weak limit of $Q_T$ is a finite convex function in $z$:

(9.31) $Q_\infty(z,k) = -k\mu_X'z + \int_E l_\delta(j, x'z)\, dN(j,x), \quad z \in \mathbb{R}^d,$

where $N$ is the Poisson point process defined in the statement of Theorem 4.1.

Part 2 also shows that, for type 2 tails, the marginal weak limit of $Q_T$ is a finite convex function in $z$:

(9.32) $Q_\infty(z,k) = -k\mu_X'z + \int_E l_\delta(j, x'z)\, dN(j,x)$ for $z \in Z_N \equiv \{ z \in \mathbb{R}^d : \max_{x \in \mathbf{X}} x'z < 0 \},$

where $N$ is the Poisson point process defined in the statement of Theorem 4.1, and

(9.33) $Q_\infty(z,k) = +\infty$ for $z \in Z_P \equiv \{ z \in \mathbb{R}^d : \max_{x \in \mathbf{X}} x'z > 0 \}.$

The function $Q_\infty(z,k)$ is convex and $l_\delta(j, x'z) = (x'z - j)^+ \ge 0$ when $j > -\delta$. Hence, $Q_\infty(z,k)$ is also well defined over the entire $\bar{Z} = \{ z \in \mathbb{R}^d : \max_{x \in \mathbf{X}} x'z \le 0 \}$, although it may equal $+\infty$ at $z$: $\max_{x \in \mathbf{X}} x'z = 0$. Also, note that $Z_N \cup Z_P$ is dense in $\mathbb{R}^d$.

Recall the convexity lemma [cf. Geyer (1996) and Knight (1999)], which states: Suppose (i) a sequence of convex lower-semicontinuous functions $Q_T : \mathbb{R}^d \to \bar{\mathbb{R}}$ marginally converges to $Q_\infty : \mathbb{R}^d \to \bar{\mathbb{R}}$ over a dense subset of $\mathbb{R}^d$, (ii) $Q_\infty$ is finite over a nonempty open set $Z_0$, and (iii) $Q_\infty$ is uniquely minimized at a random vector $Z_\infty$. Then any argmin of $Q_T$, denoted $Z_T$, converges in distribution to $Z_\infty$.

Part 2 verifies (i) and (ii), and (iii) is assumed. (A sufficient condition for uniqueness is given in Remark 4.4.) Hence, application of the convexity lemma to our case gives

(9.34) $Z_T(k) \overset{d}{\to} Z_\infty(k) = \arg\min_{z} Q_\infty(z,k).$

Note also that, for type 2 tails, the argmin $Z_\infty(k)$ necessarily belongs to $\bar{Z} = \{ z \in \mathbb{R}^d : \max_{x \in \mathbf{X}} x'z \le 0 \}$. This gives us the conclusion stated in Theorem 4.1 upon noting that $Q_\infty(z,k)$ differs from the limit objective function of Theorem 4.1 only by a finite random variable that does not depend on $z$.

Part 2. It remains to verify that (I) there exists a nonempty open set $Z_0$ such that $Q_\infty(z,k)$ is finite a.s. for all $z \in Z_0$ and (II) $Q_\infty(\cdot,k)$ is, indeed, the weak marginal limit of $Q_T(\cdot,k)$.

To show (I), when tails are of type 1 and 3, choose $Z_0$ as any open bounded subset of $\mathbb{R}^d$; when tails are of type 2, additionally require $Z_0 \subset Z_N$ (possible by compactness of $\mathbf{X}$). For any $z \in Z_0$, $(u,x) \mapsto l_\delta(u, x'z)$ is in $C_K(E)$ (continuous functions on $E$ vanishing outside a compact set $K$) by the arguments in (II). This implies $\int_E l_\delta(u, x'z)\, dN(u,x)$ is finite a.s., since $N \in M_p(E)$.

To show (II), $Q_\infty(\cdot,k)$ is the marginal weak limit of $\{Q_T(\cdot,k)\}$ iff for any finite collection $(z_j, j = 1, \ldots, l)$, $(Q_T(z_j,k), j = 1, \ldots, l) \overset{d}{\to} (Q_\infty(z_j,k), j = 1, \ldots, l)$. Since $\bar{X}'z_j \overset{p}{\to} \mu_X'z_j$ and $\tau T \to k > 0$, it remains to verify

(9.35) $\Big( \int_E l_\delta(u, x'z_j)\, d\hat{N}(u,x),\ j = 1, \ldots, l \Big) \overset{d}{\to} \Big( \int_E l_\delta(u, x'z_j)\, dN(u,x),\ j = 1, \ldots, l \Big).$

Define the mapping $\mathbb{T} : M_p(E) \to \mathbb{R}^l$ (for $E = E_1$, $E_2$ or $E_3$) by

(9.36) $\mathbb{T} : N \mapsto \Big( \int_E l_\delta(u, x'z_j)\, dN(u,x),\ j = 1, \ldots, l \Big).$

(a) Consider type 1 tails and set $E = E_1$. The map $(u,x) \mapsto l_\delta(u, x'z_j)$ is in $C_K(E_1)$ (continuous functions on $E_1$ vanishing outside a compact set $K$), since by construction it is continuous on $E_1$ and vanishes outside $K = [-\infty, \max(\kappa, -\delta)] \times \mathbf{X}$, where $\kappa = \max_{x \in \mathbf{X},\, z \in \{z_1, \ldots, z_l\}} x'z$. $K$ is compact in $E_1$ since $\kappa < \infty$ by Condition R2. Hence, $N \mapsto \mathbb{T}(N)$ is continuous from $M_p(E_1)$ to $\mathbb{R}^l$. Thus, $\hat{N} \Rightarrow N$ in $M_p(E_1)$ implies $\mathbb{T}(\hat{N}) \overset{d}{\to} \mathbb{T}(N)$.

(b) Consider type 3 tails and set $E = E_3$. The map $(u,x) \mapsto l_\delta(u, x'z_j)$ is in $C_K(E_3)$: by construction, it is continuous on $E_3$ and vanishes outside $K = [0, \max(\kappa, 0)] \times \mathbf{X}$, where $\kappa = \max_{x \in \mathbf{X},\, z \in \{z_1, \ldots, z_l\}} x'z$. $K$ is compact in $E_3$ since $\kappa < \infty$ by Condition R2. Therefore, $N \mapsto \mathbb{T}(N)$ is continuous from $M_p(E_3)$ to $\mathbb{R}^l$. Hence, $\hat{N} \Rightarrow N$ in $M_p(E_3)$ implies $\mathbb{T}(\hat{N}) \overset{d}{\to} \mathbb{T}(N)$.

(c) Consider type 2 tails and set $E = E_2$. (c)(i) shows that (9.35) holds on $Z_N$, while (c)(ii) shows that $Q_T(z,k) \to \infty$ for any $z \in Z_P$. [Sets $Z_N$ and $Z_P$ are defined in (9.32) and (9.33).]

(i) The map $(u,x) \mapsto l_\delta(u, x'z)$ is in $C_K(E_2)$ if $z \in Z_N$, since, by construction, it is continuous on $E_2$ and vanishes outside $K = [-\infty, \max(\kappa, -\delta)] \times \mathbf{X}$, where $\kappa = \max_{x \in \mathbf{X},\, z \in \{z_1, \ldots, z_l\}} x'z$. $K$ is compact in $E_2$ since $\kappa < 0$ if $z \in Z_N$. Hence, $N \mapsto \mathbb{T}(N)$ is continuous from $M_p(E_2)$ to $\mathbb{R}^l$. Then $\hat{N} \Rightarrow N$ in $M_p(E_2)$ implies $\mathbb{T}(\hat{N}) \overset{d}{\to} \mathbb{T}(N)$.

(ii) Observe that $I \equiv \sum_{t \le T} l_\delta(a_T U_t, X_t'z)\, 1(a_T U_t \le -\delta) = O_p(1)$ by the argument in (i). Observe that $l_\delta(u,v) = (v - u)^+ \ge 0$ for any $u > -\delta$. Hence,

(9.37) $l_\delta(u,v) = 1(-\delta < u \le v)(v - u) \ge 1(-\delta < u \le 0,\ v \ge \epsilon)\,\epsilon$ for any $u > -\delta$ and any $\epsilon > 0$.

For a given $z \in Z_P$, since $\mathbf{X}$ equals the support of $X$, $\max_{x \in \mathbf{X}} x'z > 0$ implies that $X'z \ge \epsilon$ occurs with positive probability for some $\epsilon > 0$. Fix this $\epsilon$. Since $1/a_T \to \infty$ for type 2 tails, $P(-\delta/a_T < U \le 0,\ X'z \ge \epsilon) \to \pi \equiv P(U \le 0,\ X'z \ge \epsilon) > 0$. Here $\pi > 0$ because $\inf_{x \in \mathbf{X}} P(U \le 0 | X = x) > 0$ for type 2 tails by assumptions in Condition R1. Therefore, $II \equiv \sum_{t \le T} 1(-\delta/a_T < U_t \le 0,\ X_t'z \ge \epsilon)\,\epsilon \to +\infty$ in $\bar{\mathbb{R}}$. Since $Q_T(z,k) \ge -k\mu_X'z + I + II$ by (9.37), $Q_T(z,k) \to +\infty$ for any $z \in Z_P$.

Pari 3. By Lemma 9.1(h), ap(/3(r) — /3 r — ftpei) -^rj(k). Hence, Z^(k) — > 

Part 4. $(Z_T(k_j)', j = 1, \ldots, l)' \in \arg\min_{z \in \mathbb{R}^{d \times l}} [Q_T(z_1,k_1) + \cdots + Q_T(z_l,k_l)]$, for $z = (z_1', \ldots, z_l')'$. Since this objective is a sum of objective functions in Parts 1 and 2, the previous derivation of the marginal limit and subsequent arguments apply very similarly to $Q_T(z_1,k_1) + \cdots + Q_T(z_l,k_l)$ to conclude that $(Z_T(k_j)', j = 1, \ldots, l)' \overset{d}{\to} (Z_\infty(k_j)', j = 1, \ldots, l)' = \arg\min_{z \in \mathbb{R}^{d \times l}} [Q_\infty(z_1,k_1) + \cdots + Q_\infty(z_l,k_l)]$. □



9.4. Weak limit of $\hat{N}$.



Lemma 9.3 [Resnick (1987), Proposition 3.22]. Suppose $N$ is a simple point process in $M_p(E)$, $\mathcal{T}$ is a basis of relatively compact open sets such that $\mathcal{T}$ is closed under finite unions and intersections and, for any $F \in \mathcal{T}$, $P(N(\partial F) = 0) = 1$. Then $\hat{N} \Rightarrow N$ in $M_p(E)$ if for all $F \in \mathcal{T}$,

(9.38) $\lim_{T \to \infty} P[\hat{N}(F) = 0] = P[N(F) = 0],$

(9.39) $\lim_{T \to \infty} E\hat{N}(F) = EN(F) < \infty.$

Remark 9.1. In our case, $\mathcal{T}$ consists of finite unions and intersections of bounded open rectangles in $E_1$, $E_2$ and $E_3$ [cf. Resnick (1987)].



We impose Meyer (1973) conditions on the "rare" events $A_t^T(F) \equiv \{\omega \in \Omega : (a_T(U_t - b_T), X_t) \in F\}$.

Lemma 9.4 (Poisson limits under Meyer mixing conditions). Suppose that, for any $F \in \mathcal{T}$, the triangular sequence of events $\{(A_t^T(F), t \le T), T \ge 1\}$ is stationary and $\alpha$-mixing with mixing coefficient $\alpha_T(\cdot)$, condition (9.39) holds, and the Meyer type condition holds: There exist sequences of integers $(p_n, n \ge 1)$, $(q_n, n \ge 1)$, $(t_n = n(p_n + q_n), n \ge 1)$ such that as $n \to \infty$, for some $r > 0$, (a) $n^r \alpha_{t_n}(q_n) \to 0$, (b) $q_n/p_n \to 0$, $p_{n+1}/p_n \to 1$, and (c) $I_{p_n} = \sum_{i=1}^{p_n - 1} (p_n - i)\, P\big(A_1^{t_n}(F) \cap A_{i+1}^{t_n}(F)\big) = o(1/n)$. Then $\hat{N} \Rightarrow N$ in $M_p(E)$, where $N$ is a Poisson point process with mean measure $m$: $m(F) = \lim_{T \to \infty} E\hat{N}(F)$.

Proof. For any $F$: $m(F) > 0$, $\lim_{T \to \infty} P[\hat{N}(F) = 0] = P[N(F) = 0] = e^{-m(F)}$, by Meyer (1973). The same also holds for $F$: $m(F) = 0$, since $E\hat{N}(F) \to 0$ implies $P(\hat{N}(F) = 0) \to 1$. Conclude by Lemma 9.3. □

Remark 9.2. Condition $I_{p_n} = o(1/n)$ prevents clusters of "rare" events $A_t^T(F)$, eliminating compound Poisson processes as limits.
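
The void-probability condition (9.38) and the mean condition (9.39) can be made concrete in the simplest i.i.d. case without regressors. The sketch below assumes, purely for illustration, that $U_t$ is Uniform(0,1) (a type 3 tail with $\xi = -1$), with normalizing constants $a_T = T$ and $b_T = 0$ and the set $F = [0, a]$; these choices and the function names are assumptions, not taken from the text. Under them the limiting mean measure of $F$ is $a$ and the void probability should approach $e^{-a}$.

```python
import math

def prob_no_points(T, a):
    # P[N_hat([0, a]) = 0] for i.i.d. U_t ~ Uniform(0, 1) with a_T = T, b_T = 0:
    # every T * U_t must exceed a, so the probability is (1 - a/T)^T for a <= T.
    return (1.0 - a / T) ** T

def expected_points(T, a):
    # E N_hat([0, a]) = T * P(T * U <= a), which equals a whenever a <= T.
    return T * min(a / T, 1.0)

a = 1.5
for T in (10, 100, 10_000):
    print(T, prob_no_points(T, a), math.exp(-a), expected_points(T, a))
```

The printed void probabilities should move toward `math.exp(-a)` as `T` grows, while the expected count stays at `a`.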

Lemma 9.5 (Limit $N$ under Conditions R1 and R2). Suppose Conditions R1 and R2 hold and that $(Y_t, X_t)$ is an i.i.d. or stationary strongly mixing sequence that satisfies the conditions of Lemma 9.4 with $(a_T, b_T)$ defined in (4.2). Then:

(i) $\hat{N} \Rightarrow N$ in $M_p(E)$, where $E = E_1$, $E_2$ and $E_3$ for tails of types 1, 2 and 3, respectively. $N$ is a Poisson point process with mean intensity measure $m(du, dx) = K(x) \times dh(u) \times dF_X(x)$, where $h(u) = e^u$ for type 1, $h(u) = (-u)^{-1/\xi}$ for type 2, and $h(u) = u^{-1/\xi}$ for type 3 tails.

(ii) Points $(J_i, \mathcal{X}_i)$ of $N$ have the representation $(J_i, \mathcal{X}_i, i \ge 1) = (h^{-1}(\Gamma_i/K(\mathcal{X}_i)), \mathcal{X}_i, i \ge 1)$, where $h^{-1}$ is the inverse of $h$, $\Gamma_i = \mathcal{E}_1 + \cdots + \mathcal{E}_i$, $i \ge 1$ ($\{\mathcal{E}_i\}$ are i.i.d. standard exponential), and $\{\mathcal{X}_i\}$ are i.i.d. r.v.s with law $F_X$, independent of $\{\mathcal{E}_i\}$.
Proof. To show (i), by Lemmas 9.3 and 9.4 the proof reduces to verifying $\lim_T E\hat{N}(F) = m(F)$ for all $F$ in $\mathcal{T}$. For example, as in Leadbetter, Lindgren and Rootzén [(1983), page 103], it suffices to consider $F$ of the form $F = \cup_{j=1}^{k} F_j$, where $F_j = (l_j, u_j) \times \mathbf{X}_j$, where $F_1, \ldots, F_k$ are nonoverlapping, nonempty subsets of $E$, and $\mathbf{X}_1, \ldots, \mathbf{X}_k$ are intersections of open bounded rectangles of $\mathbb{R}^d$ with $\mathbf{X}$. Then by the stationarity and the $F_j$'s nonoverlapping,

(9.40) $E\hat{N}(F) = E \sum_{t=1}^{T} 1[(a_T(U_t - b_T), X_t) \in F]$
$= \sum_{j=1}^{k} T \cdot P[(a_T(U - b_T), X) \in (l_j, u_j) \times \mathbf{X}_j]$
$= \sum_{j=1}^{k} T \cdot E\big(P[(a_T(U - b_T), X) \in (l_j, u_j) \times \mathbf{X}_j \,|\, X]\big)$
$= \sum_{j=1}^{k} T \cdot E\big(P[a_T(U - b_T) \in (l_j, u_j) \,|\, X] \cdot 1[X \in \mathbf{X}_j]\big)$
$= \sum_{j=1}^{k} T \cdot E\big((F_U[u_j/a_T + b_T | X] - F_U[l_j/a_T + b_T | X]) \cdot 1[X \in \mathbf{X}_j]\big).$

Suppose that $l_j > -\infty$ for all $j$. Then as $T \to \infty$,

(9.41) $E\hat{N}(F) \sim \sum_{j=1}^{k} E\big((K(X)[h(u_j) - h(l_j)])\, 1[X \in \mathbf{X}_j]\big) = \sum_{j=1}^{k} \int_{F_j} K(x)\, dh(u) \times dF_X(x) = \sum_{j=1}^{k} m(F_j) = m(F).$

In (9.41), $\sim$ follows from two observations. First, the assumed tail equivalence Condition R1 implies

(9.42) $\dfrac{F_U[l/a_T + b_T | x]}{F_u[l/a_T + b_T]} \to K(x)$ uniformly in $x \in \mathbf{X}$,

since by definition of $(a_T, b_T)$ given in (4.2), $l/a_T + b_T \searrow F_u^{-1}(0) = 0$ or $= -\infty$ for any $l \in (-\infty, \infty)$ for type 1 tails, any $l \in (-\infty, 0)$ for type 2 tails, and $l \in [0, \infty)$ for type 3 tails. Second, for example, as in Leadbetter, Lindgren and Rootzén [(1983), page 103], the definition of the tail types (3.1) implies that (a) for tails of type 2, for any $l < 0$, $T F_u(l/a_T) = T F_u(-l F_u^{-1}(1/T)) \sim (-l)^{-1/\xi}\, T F_u(F_u^{-1}(1/T)) \sim (-l)^{-1/\xi}$, (b) for tails of type 3, for any $l > 0$, $T F_u(l/a_T) = T F_u(l F_u^{-1}(1/T)) \sim l^{-1/\xi}\, T F_u(F_u^{-1}(1/T)) \sim l^{-1/\xi}$, and (c) for tails of type 1, for any $l \in \mathbb{R}$, $T F_u(l/a_T + b_T) = T F_u(l\, a(F_u^{-1}(1/T)) + F_u^{-1}(1/T)) \sim e^{l}\, T F_u(F_u^{-1}(1/T)) \sim e^{l}$.

On the other hand, if for some $j$'s, $l_j = -\infty$ for type 1 or 2 tails, then we have the replacement $T F_U[l_j/a_T + b_T | X] = 0$ in (9.40), and (9.41) follows similarly.

To show (ii), construct a Poisson random measure (PRM) with the given $m(\cdot)$. First, define a canonical homogeneous PRM $N_1$ with points $\{\Gamma_i, i \ge 1\}$. It has the mean measure $m_1(du) = du$ on $[0, \infty)$, for example, Resnick (1987). Second, by Proposition 3.8 in Resnick (1987), the composed point process $N_2$ with points $\{\Gamma_i, \mathcal{X}_i\}$ is PRM with mean measure $m_2(du, dx) = du \times dF_X(x)$ on $[0, \infty) \times \mathbf{X}$, because $\{\mathcal{X}_i\}$ are i.i.d. and are independent of $\{\Gamma_i\}$. Finally, the point process $N$ with the transformed points $\{\mathbb{T}(\Gamma_i, \mathcal{X}_i)\}$, where $\mathbb{T} : (u, x) \mapsto (h^{-1}(u/K(x)), x)$, is PRM with the desired mean measure on $E$, $m(dj, dx) = m_2 \circ \mathbb{T}^{-1}(dj, dx) = K(x) \times dh(j) \times dF_X(x)$, by Proposition 3.7 in Resnick (1987). □
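
The representation in Lemma 9.5(ii) suggests a direct way to simulate the limit point process. The following Python sketch is an illustration under assumptions that are not in the text: a type 3 tail with $\xi = -0.5$, a single non-constant regressor $x_2 \sim$ Uniform$(-0.5, 0.5)$, $c = (1, c_2)'$ with $c_2 = 1$, and $K(x) = (x'c)^{1/\xi}$. It generates the points with $J_i \le a$ and compares the average number of such points with the mean measure $E[K(X)]\,h(a)$ implied by $m(du, dx) = K(x) \times dh(u) \times dF_X(x)$.

```python
import random

def h(u, xi):
    # Type 3 tail (xi < 0): h(u) = u^(-1/xi) on [0, inf).
    return u ** (-1.0 / xi)

def h_inv(y, xi):
    # Inverse of h: h_inv(y) = y^(-xi).
    return y ** (-xi)

def K(x2, xi, c2):
    # Hypothetical scale factor K(x) = (x'c)^(1/xi) with x = (1, x2), c = (1, c2).
    return (1.0 + c2 * x2) ** (1.0 / xi)

def simulate_points(a, xi, c2, rng):
    """Return the points (J_i, X_i) of N with J_i <= a."""
    k_max = max(K(-0.5, xi, c2), K(0.5, xi, c2))  # K is monotone in x2
    bound = k_max * h(a, xi)  # a Gamma_i above this bound cannot give J_i <= a
    points, gamma = [], 0.0
    while True:
        gamma += rng.expovariate(1.0)  # Gamma_i = E_1 + ... + E_i
        if gamma > bound:
            return points
        x2 = rng.uniform(-0.5, 0.5)  # X_i drawn independently of the E_i
        j = h_inv(gamma / K(x2, xi, c2), xi)
        if j <= a:
            points.append((j, x2))

def mean_count(a, xi, c2, reps, seed=0):
    # Monte Carlo average of the number of points with J_i <= a.
    rng = random.Random(seed)
    return sum(len(simulate_points(a, xi, c2, rng)) for _ in range(reps)) / reps

# With xi = -0.5 and c2 = 1: E[K(X)] = 4/3 and h(1.2) = 1.44, so the target is 1.92.
print(mean_count(1.2, -0.5, 1.0, reps=4000), (4.0 / 3.0) * h(1.2, -0.5))
```

Stopping the loop once $\Gamma_i$ exceeds `bound` is exact rather than approximate, because $J_i \le a$ requires $\Gamma_i \le K(\mathcal{X}_i)\,h(a) \le$ `bound`.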

9.5. Proof of Theorem 5.1. Step 1 outlines the overall proof using standard convexity arguments, while the main Step 2 invokes regular variation assumptions on the conditional quantile density to demonstrate a quadratic approximation of the criterion function. Step 3 shows joint convergence of several regression quantile statistics. Step 4 demonstrates that $a_T$ can be estimated consistently.

Step 1. With reference to (2.4), notice that Z T = OT0 (t) - /3(r)), de- 
fined in (5.2), minimizes 




Using Knight's identity,

(9.44) $\rho_\tau(u - v) - \rho_\tau(u) = -v\,(\tau - 1(u \le 0)) + \int_0^v \big(1(u \le s) - 1(u \le 0)\big)\, ds,$




write, a.s., 



Qt{z,t) = Wt{t)'z + G t (z,t) 1 
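
Knight's identity (9.44) is the device that splits the objective into a linear term and an integral term, and it is easy to verify numerically. The sketch below is an illustration only; it takes the identity in the form $\rho_\tau(u - v) - \rho_\tau(u) = -v(\tau - 1(u \le 0)) + \int_0^v (1(u \le s) - 1(u \le 0))\,ds$ with $\rho_\tau(u) = (\tau - 1(u \le 0))u$, and evaluates the integral in closed form. The function names are hypothetical.

```python
import random

def rho(tau, u):
    # Check function rho_tau(u) = (tau - 1(u <= 0)) * u.
    return u * (tau - (1.0 if u <= 0 else 0.0))

def knight_integral(u, v):
    # Closed form of int_0^v [1(u <= s) - 1(u <= 0)] ds.
    if v >= 0:
        # The integrand is 1 on [u, v] when u > 0 and vanishes when u <= 0.
        return max(v - u, 0.0) if u > 0 else 0.0
    # For v < 0 the integrand is -1 on [v, u) when u <= 0 and vanishes when u > 0;
    # integrating from 0 down to v flips the sign.
    return max(u - v, 0.0) if u <= 0 else 0.0

def knight_rhs(tau, u, v):
    # Right-hand side of Knight's identity.
    return -v * (tau - (1.0 if u <= 0 else 0.0)) + knight_integral(u, v)

rng = random.Random(0)
worst = 0.0
for _ in range(10_000):
    tau, u, v = rng.random(), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    worst = max(worst, abs(rho(tau, u - v) - rho(tau, u) - knight_rhs(tau, u, v)))
print(worst < 1e-12)
```

The first term on the right-hand side is linear in $v$ and gives rise to $W_T(\tau)'z$, while the nonnegative integral term gives rise to $G_T(z,\tau)$.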
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By Lemma 9.6, W T {j) 4 W = N(0,EXX'), and by Step 2 




where Q H = H(x) = x'c for type 2 and 3 tails, and H(x) = 

1 for type 1 tails. Thus, the weak marginal limit of Qt{z) is given by 



We have that $EXX'$ is positive definite and by Theorem 3.1 that $0 < H(X) < c < \infty$ for some constant $c$. Thus, $Q_H$ is finite and $Q_H$ is positive definite. Indeed, $z'Q_H z = E(X'z)^2/H(X) = 0$ for some $z \neq 0$ if and only if $X'z = 0$ a.s., which contradicts $EXX'$ positive definite. Thus, the marginal limit $Q_\infty(z)$ is uniquely minimized at $Z_\infty = \Big(\dfrac{\xi}{m^{-\xi} - 1}\Big) Q_H^{-1} W = N\Big(0, \dfrac{\xi^2}{(m^{-\xi} - 1)^2}\, Q_H^{-1}\, EXX'\, Q_H^{-1}\Big)$. By the convexity lemma [e.g., Geyer (1996) and Knight (1999)], $Z_T \overset{d}{\to} Z_\infty$.

Step 2. This step demonstrates that as r \ 0, 



while Lemma 9.6 shows that Y&i(Gt(z,t)) — > 0. In what follows, Ft, ft 
and £Jt denote Fjj(-\Xt), fu(-\X t ) and i?[-|X f ], respectively, where U is the 
auxiliary error constructed in Condition Rl. 



(9.47) 




(9.48) 




Since 



Gt(z,t) 



(9.49) 





we have 
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(9.50) =T-E(l- (X' t z^ 2 ft ^ F t~ 1( - T ^ 



2 ax ■ v tT 

~ E [r^ ■ tmftHt)})-! 



.2 v * ' H{X) -i 
1 rn-C _ i 

Equality (1) is by the definition of $a_T$ and a Taylor expansion. Indeed, since $\tau T \to \infty$, uniformly over $s$ in any compact subset of $\mathbb{R}$,

(9.51) $s/a_T = s \cdot (F_u^{-1}(m\tau) - F_u^{-1}(\tau))/\sqrt{\tau T} = o(F_u^{-1}(m\tau) - F_u^{-1}(\tau)).$

To show equivalence (2), it suffices to prove that, for any sequence $v_\tau = o(F_u^{-1}(m\tau) - F_u^{-1}(\tau))$ with $m > 1$ as $\tau \searrow 0$,

(9.52) $f_t(F_t^{-1}(\tau) + v_\tau) \sim f_t(F_t^{-1}(\tau))$ uniformly in $t$.

This will be shown by using the assumption made in Condition R3, which is that uniformly in $t$, $1/f_t(F_t^{-1}(\tau)) \sim \partial F_u^{-1}(\tau/K(X_t))/\partial\tau$, where $\partial F_u^{-1}(\tau)/\partial\tau$ is regularly varying with index $-\xi - 1$.

To be clear, let us first show (9.52) for the special case of $f_t = f_u$ and $F_t^{-1}(\tau) = F_u^{-1}(\tau)$:

(9.53) $f_u(F_u^{-1}(\tau) + v_\tau) \sim f_u(F_u^{-1}(\tau)).$

By the regular variation property of $\partial F_u^{-1}(\tau)/\partial\tau = 1/f_u(F_u^{-1}(\tau))$, locally uniformly in $l$ [uniformly in $l$ in any compact subset of $(0, \infty)$],

(9.54) $f_u(F_u^{-1}(l\tau)) \sim l^{\xi+1} f_u(F_u^{-1}(\tau)).$

That is, locally uniformly in $l$,

(9.55) $f_u\big(F_u^{-1}(\tau) + [F_u^{-1}(l\tau) - F_u^{-1}(\tau)]\big) \sim l^{\xi+1} f_u(F_u^{-1}(\tau)).$

Hence, for any $l_\tau \to 1$,

(9.56) $f_u\big(F_u^{-1}(\tau) + [F_u^{-1}(l_\tau\tau) - F_u^{-1}(\tau)]\big) \sim f_u(F_u^{-1}(\tau)).$

Hence, for any sequence $v_\tau = o([F_u^{-1}(m\tau) - F_u^{-1}(\tau)])$ with $m > 1$ as $\tau \searrow 0$,

(9.57) $f_u(F_u^{-1}(\tau) + v_\tau) \sim f_u(F_u^{-1}(\tau)),$

because for any such $\{v_\tau\}$, in view of Lemma 9.2(iii), we can choose a sequence $\{l_\tau\}$ such that $\{v_\tau\} = \{[F_u^{-1}(l_\tau\tau) - F_u^{-1}(\tau)]\}$ and $l_\tau \to 1$ as $\tau \searrow 0$.

Next, let us strengthen the claim (9.53) to (9.52), completing the proof of equivalence (2) in (9.50). Since

(a) $1/f_t(F_t^{-1}(\tau)) \sim \partial F_u^{-1}(\tau/K(X_t))/\partial\tau = 1/\big(K(X_t) f_u[F_u^{-1}(\tau/K(X_t))]\big)$ uniformly in $t$ by Condition R3, and

(b) $f_u(F_u^{-1}(l\tau/K)) \sim (l/K)^{\xi+1} f_u(F_u^{-1}(\tau)) \sim l^{\xi+1} f_u(F_u^{-1}(\tau/K))$, locally uniformly in $l$ and uniformly in $K \in \{K(x) : x \in \mathbf{X}\}$ [compact by assumptions on $K(\cdot)$ and $\mathbf{X}$], by (9.54),

we have that locally uniformly in $l$ and uniformly in $t$,

(9.58) $f_t(F_t^{-1}(l\tau)) \sim l^{\xi+1} f_t(F_t^{-1}(\tau)).$

Repeating the steps (9.55)-(9.57) with $f_t(F_t^{-1}(l\tau))$ in place of $f_u(F_u^{-1}(l\tau))$, we obtain the required conclusion (9.52).

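The equivalence (3) in (9.50) can be shown as follows. By (a), uniformly in $t$,

(9.59) $\dfrac{F_u^{-1}(m\tau) - F_u^{-1}(\tau)}{\tau\,(f_t[F_t^{-1}(\tau)])^{-1}} \sim \dfrac{F_u^{-1}(m\tau) - F_u^{-1}(\tau)}{\tau\,\big(K(X_t) f_u[F_u^{-1}(\tau/K(X_t))]\big)^{-1}}.$

By (b) we have that uniformly in $t$,

(9.60) $f_u[F_u^{-1}(\tau/K(X_t))] \sim (1/K(X_t))^{\xi+1} \cdot f_u(F_u^{-1}(\tau)).$

Putting (9.59) and (9.60) together, we have uniformly in $t$,

(9.61) $\dfrac{F_u^{-1}(m\tau) - F_u^{-1}(\tau)}{\tau\,(f_t[F_t^{-1}(\tau)])^{-1}} \sim \dfrac{1}{K(X_t)^{\xi}} \cdot \dfrac{F_u^{-1}(m\tau) - F_u^{-1}(\tau)}{\tau\,(f_u[F_u^{-1}(\tau)])^{-1}}$

(9.62) $= \dfrac{1}{H(X_t)} \cdot \dfrac{F_u^{-1}(m\tau) - F_u^{-1}(\tau)}{\tau\,(f_u[F_u^{-1}(\tau)])^{-1}},$

where $H(X_t) = X_t'c$ for $\xi \neq 0$ and $H(X_t) = 1$ for $\xi = 0$. Finally, by the regular variation property, (9.54),

(9.63) $\dfrac{F_u^{-1}(m\tau) - F_u^{-1}(\tau)}{\tau\,(f_u[F_u^{-1}(\tau)])^{-1}} = \int_1^m \dfrac{f_u[F_u^{-1}(\tau)]}{f_u(F_u^{-1}(s\tau))}\, ds$

(9.64) $\sim \int_1^m s^{-\xi-1}\, ds$

(9.65) $= \dfrac{m^{-\xi} - 1}{-\xi}$ ($\ln m$ if $\xi = 0$).

Putting (9.61)-(9.65) together gives (3) in (9.50).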
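
The last step, from (9.64) to (9.65), is an elementary integral, and the $\xi = 0$ case is its limit. A short numerical check (a sketch with hypothetical function names, using a midpoint rule chosen only for simplicity) is:

```python
import math

def tail_integral(m, xi, steps=20_000):
    # Midpoint rule for int_1^m s^(-xi - 1) ds, the integral in (9.64).
    width = (m - 1.0) / steps
    total = sum((1.0 + (i + 0.5) * width) ** (-xi - 1.0) for i in range(steps))
    return total * width

def closed_form(m, xi):
    # (m^(-xi) - 1) / (-xi), read as ln(m) when xi = 0, as in (9.65).
    if xi == 0.0:
        return math.log(m)
    return (m ** (-xi) - 1.0) / (-xi)

for xi in (-0.5, 0.0, 0.7):
    print(xi, tail_integral(2.5, xi), closed_form(2.5, xi))
```

Evaluating `closed_form(m, xi)` for `xi` close to zero should also approach `math.log(m)`, which is why the $\xi = 0$ case is stated in parentheses rather than separately.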

Step 3. For $Z_T(l)$ defined in (5.3), notice that $(Z_T(l_i), i = 1, \ldots, k) \in \arg\min_{z \in \mathbb{R}^{d \times k}} [Q_T(z_1, l_1\tau) + \cdots + Q_T(z_k, l_k\tau)] = \arg\min_{z \in \mathbb{R}^{d \times k}} \big[\sum_{i=1}^{k} W_T(l_i\tau)'z_i + G_T(z_i, l_i\tau)\big]$ for $z = (z_1', \ldots, z_k')'$, where the functions $Q_T(\cdot,\cdot)$, $W_T(\cdot)$ and $G_T(\cdot,\cdot)$ are defined in (9.45). Since this objective function is a sum of the objective functions in the preceding steps, it retains the properties of the elements summed. Therefore, the previous argument applies to conclude that the marginal limit of this objective function is given by $\sum_{i=1}^{k} W(l_i)'z_i + G(z_i, l_i)$, where $(W(l_i), i \le k) = N(0, \Sigma)$ with $EW(l_i)W(l_j)' = EXX' \min(l_i, l_j)/\sqrt{l_i l_j}$ and, by calculations that are identical to those in the preceding section, $G(z, l_i) = G(z) = \dfrac{1}{2} \cdot \dfrac{m^{-\xi} - 1}{-\xi} \cdot z'Q_H z$. The limit objective function is minimized at $(Z_\infty(l_i), i \le k) = \Big(\dfrac{\xi}{m^{-\xi} - 1}\, Q_H^{-1} W(l_i), i \le k\Big)$. Therefore, $(Z_T(l_i), i \le k) \overset{d}{\to} (Z_\infty(l_i), i \le k)$.

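Step 4. It suffices to prove the result for $l = 1$. Then

(9.66) $\dfrac{\bar{X}'(\hat\beta(m\tau) - \hat\beta(\tau))}{\mu_X'(\beta(m\tau) - \beta(\tau))} = \dfrac{\bar{X}'(\hat\beta(m\tau) - \beta(m\tau))}{\mu_X'(\beta(m\tau) - \beta(\tau))} - \dfrac{\bar{X}'(\hat\beta(\tau) - \beta(\tau))}{\mu_X'(\beta(m\tau) - \beta(\tau))} + \dfrac{\bar{X}'(\beta(m\tau) - \beta(\tau))}{\mu_X'(\beta(m\tau) - \beta(\tau))} \overset{p}{\to} 1,$

since the first two elements on the right-hand side are $O_p(1/\sqrt{\tau T}) = o_p(1)$ by the first part of Theorem 5.1. □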

9.6. CLT for $W_T(\tau)$ and LLN for $G_T(z,\tau)$.

Lemma 9.6 (CLT and LLN). Let {Y^X,}*.^ be an i.i.d. or a stationary 
a-mixing sequence. The following statements are true for Wt(-) and Gt(-, ■), 
defined in (9.45), as r \ and tT — > oo; 

(i) Suppose mixing coefficients satisfy ay = 0(j~^) with (f)> 2, and for 
any K sufficiently close to + or —oo, uniformly in t and s > 1, and some 
C>0[P t denotes P{-\T t )^t = <r({Yj,Xj}t£)] 

(9.67) P t (U t < K, U t+S < K) < CP t (U t < K) 2 . 
Then for Qjfiy finite collection of positive constants . . . , Im? 

{w T (Th)>, w T (Ti m yy 4 (w(hy, w(i k yy = n(o, s) 

with EW(li)W(lj)' = EXX'mintkJ^/^/kl]. 

(ii) //, in addition, ay = 0(j~^) with (f> > for < 7 < 1 and t 1 ^ 2 ^ /T - 
0, then 

(9.68) Var(G T (z,r)) -> 0. 

Remark 9.3. In the i.i.d. case the claim (i) simply follows from the 
Lindeberg-Feller CLT. In the dependent case condition (9.67) requires that 






the extremal events should not cluster, which leads to the same limits as 
under i.i.d. sampling. This condition may possibly be refined along the lines 
of Watts, Rootzen and Leadbetter (1982), who dealt with the nonregression 
case. (9.67) is analogous to the no-clustering conditions of Robinson [(1983), 
A7.4, page 191] used in the context of kernel estimation. 

Proof of Lemma 9.6. To show (i), $\{W_T(\tau l_i)', i \le m\}'$ suits the CLT of Robinson (1983), which implies the same weak limit as under i.i.d. sampling. His conditions A7.1 (with $q = 0$), A7.2 and A7.3 are satisfied automatically. The mixing condition assumed above implies $\sum_{j=1}^{\infty} j\,\alpha_j < \infty$, which implies his condition A3.3. Last, condition (9.67) immediately implies his condition A7.4.

To show (ii), suppress r. Then from (9.49), 

Var(G T (z)) = r" 1 (Var(Ai) + 2 £ -j- Cov(Ai, A 1+fc )J , 

for 

X t = f XtZ [t(Y t - X' t P(T) < a/or) - t{Y t - X' t /3(r) < 0)] ds. 
Jo 

By Condition R2, \X t \ < Ko\^ t \, for 

fit = (l(Y t - X' t P(r) < X[z/a T ) - l(Y t - X' t 0(r) < 0)) 
and some K$ < oo. Hence, 

Var(Ai) = O(EXl) ( = } O(E^) ( = } 0(E\m\) 

(9.69) 

^ 0{f u (F-\T))a- T l ) = 0(JV/T), 

where (1) is by |A*| < Ko\^ t \, (2) is by \ fi t \ G {0, 1}, and (3) is by the calcula- 
tion in (9.50). Thus, in the i.i.d. case \&t(Gt{z)) =o(l) follows from (9.69) 
and tT — > oo. Also, for all s and some positive constants K\, K2, K3, K4, 

(9.70) I Cov(A!, A 1+s )| < ^(^"^[^lAan^^lAin 1 ^) 

(9.71) <K 2 (al^[E\^ r [E\^n 

(9.72) <K 3 (al^[E\^\D 

(9-73) <K 4 (al^(^y /2 y 

where 1/p+l/r = 7 G (0, 1), p > 1. Here (9.70) follows by Ibragimov's mixing 
inequality [e.g., Davidson (1994)], (9.71) follows by the previous bound |Aj| < 
K \nt\ and (9.72) follows by \fi t \ G {0,1}, while (9.73) follows by (9.69). So 
Var(GT(z)) = o(l) by the condition on the mixing coefficients. □ 
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9.7. Proof of Theorem 6.1. Theorem 6.1 is a direct corollary of Theo- 
rem 5.1 and Lemma 9.1. Proof of claim (i) follows similarly to the proof 
in (9.66). Claim (i) implies claims (ii)-(iv), using the properties (v) and (vi) 
in Lemma 9.1. Uniformity in x in claim (iv) follows from the linearity of 
p x x i in x. Finally, claim (v) follows from Lemma 3 by the delta method. 

9.8. Tightness of $Z_\infty(k)$. This section provides primitive conditions for tightness of $Z_\infty(k)$, which is assumed in the statement of Theorem 4.1 and the conditions of uniqueness given in Remark 4.4.

We impose the design condition of Portnoy and Jurečková (1999), who used it for the case $\tau T \to 0$ and show its plausibility on page 233, for example, when $EXX' > 0$. Their proof of tightness is not applicable here, so we provide one.


Condition PJ. Let $F_X$ denote the distribution function of $X$. There are a finite integer $l$, a collection of sets $\{R_1, \ldots, R_l\}$ and positive constants $\delta$ and $\eta$ such that:

(a) for each $u \in \{u : \|u\| \ge 1,\ u_1 \ge 0\}$, there is $R_{j(u)}$ such that $x'u > \delta\|u\|$ for all $x \in R_{j(u)}$;

(b) $\int_{R_j} dF_X > \eta > 0$ for all $j = 1, \ldots, l$.

Lemma 9.7. If Conditions R1, R2 and PJ hold, then $Z_\infty(k)$ is finite a.s.

Proof. Choose $z^f = (z_1^f, \ldots, z_d^f)' \in \mathbb{R}^d$ such that

(9.74) $Q_\infty(z^f, k) = -k\mu_X'z^f + \int_E (x'z^f - u)^+\, dN(u,x) = O_p(1),$

which is possible, as shown in the proof of Theorem 4.1.

Consider a closed ball $B(M)$ with radius $M$ and center $z^f$, and let $z(k) = z^f + \delta(k)v(k)$, where $v(k) = (v_1(k), \ldots, v_d(k))'$ is a direction vector with unity norm $\|v(k)\| = 1$ and $\delta(k) \ge M$. By convexity in $z$,

(9.75) $\dfrac{M}{\delta(k)}\big(Q_\infty(z(k),k) - Q_\infty(z^f,k)\big) \ge Q_\infty(z^*(k),k) - Q_\infty(z^f,k),$

where $z^*(k)$ is a point of the boundary of $B(M)$ on the line connecting $z(k)$ and $z^f$. We will show that, for any $K$ and $\epsilon > 0$, there is large $M$ such that

(9.76) $P\Big( \inf_{v(k) : \|v(k)\| = 1} Q_\infty(z^*(k),k) > K \Big) \ge 1 - \epsilon.$

(9.76) and (9.74) imply (9.75) $> C > 0$ with probability arbitrarily close to 1 for $M$ sufficiently large, meaning that $Z_\infty(k) \in B(M)$ with probability arbitrarily close to 1 for $M$ sufficiently large, that is, $Z_\infty(k) = O_p(1)$.

Thus, it remains to show (9.76). Since $\mu_X = (1, 0, \ldots, 0)'$, $\mu_X'z^*(k) = z_1^f + v_1(k) \cdot M$. Hence, it suffices to show that, for any $\epsilon > 0$ and any large $K > 0$,

(9.77) $-v_1(k) \cdot k \cdot M + \int_E (x'z^*(k) - u)^+\, dN(u,x) > K$ w.pr. $\ge 1 - \epsilon$, for large enough $M$,

and, therefore, we establish (9.76). We have by Condition PJ that, for some $R_{j(v)}$ with $j(v) \in \{1, \ldots, l\}$,

(9.78) $\int_E (x'z^*(k) - u)^+\, dN(u,x) \ge \int_{([-\infty,\kappa] \times R_{j(v)}) \cap E} (x'z^*(k) - u)^+\, dN(u,x) \ge N\big(([-\infty,\kappa] \times R_{j(v)}) \cap E\big) \times (\delta M - \kappa - \kappa')^+,$

where $\kappa \in \mathbb{R}$ is a constant to be determined later that does not depend on $v(k)$, and $\kappa' = \max_{x \in \mathbf{X}} |x'z^f|$.

Note that for any region $\tilde{\mathbf{X}}$ such that $\int_{\tilde{\mathbf{X}}} dF_X > \eta > 0$ and any $K_1 > 0$ and $\epsilon > 0$, there is a sufficiently large $\kappa_2$ such that

(9.79) $N\big(([-\infty,\kappa_2] \times \tilde{\mathbf{X}}) \cap E\big) > K_1$ w.pr. $\ge 1 - \epsilon$.

Hence, by (9.79) we can select $\kappa$ large enough so that

(9.80) $N\big(([-\infty,\kappa] \times R_j) \cap E\big) \ge \dfrac{k+1}{\delta}$ for all $j \in \{1, \ldots, l\}$ w.pr. $\ge 1 - \epsilon$,

so that w.pr. $\ge 1 - \epsilon$,

(9.81) $-v_1(k) \cdot k \cdot M + \int_E (x'z^*(k) - u)^+\, dN(u,x) \ge -k \cdot M + (k+1)\,\dfrac{(\delta M - \kappa - \kappa')^+}{\delta}.$

Now set $M$ sufficiently large to obtain (9.77). □